Re: [PATCH] drm/dbi: Print errors for mipi_dbi_command()

2021-07-01 Thread Sam Ravnborg
Hi Linus,

On Fri, Jul 02, 2021 at 12:25:18AM +0200, Linus Walleij wrote:
> The macro mipi_dbi_command() does not report errors unless you wrap it
> in another macro to do the error reporting.
> 
> Report a rate-limited error so we know what is going on.
> 
> Drop the only user in DRM using mipi_dbi_command() and actually checking
> the error explicitly, let it use mipi_dbi_command_buf() directly
> instead.
> 
> After this any code wishing to send command arrays can rely on
> mipi_dbi_command() providing an appropriate error message if something
> goes wrong.
> 
> Suggested-by: Noralf Trønnes 
> Suggested-by: Douglas Anderson 
> Signed-off-by: Linus Walleij 
> ---
>  drivers/gpu/drm/drm_mipi_dbi.c | 2 +-
>  include/drm/drm_mipi_dbi.h | 5 -
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
> index 3854fb9798e9..c7c1b75df190 100644
> --- a/drivers/gpu/drm/drm_mipi_dbi.c
> +++ b/drivers/gpu/drm/drm_mipi_dbi.c
> @@ -645,7 +645,7 @@ static int mipi_dbi_poweron_reset_conditional(struct 
> mipi_dbi_dev *dbidev, bool
>   return 1;
>  
>   mipi_dbi_hw_reset(dbi);
> - ret = mipi_dbi_command(dbi, MIPI_DCS_SOFT_RESET);
> + ret = mipi_dbi_command_buf(dbi, MIPI_DCS_SOFT_RESET, NULL, 0);
>   if (ret) {
>   DRM_DEV_ERROR(dev, "Failed to send reset command (%d)\n", ret);
>   if (dbidev->regulator)
I do not see the value in this change.
There are many other mipi_dbi_command() users, and their error return
continues to be checked?


> diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h
> index f543d6e3e822..2057ad32760c 100644
> --- a/include/drm/drm_mipi_dbi.h
> +++ b/include/drm/drm_mipi_dbi.h
> @@ -183,7 +183,10 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer 
> *fb,
>  #define mipi_dbi_command(dbi, cmd, seq...) \
>  ({ \
>   const u8 d[] = { seq }; \
> - mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> + int ret; \
> + ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> + if (ret) \
> + pr_err_ratelimited("MIPI DBI: error %d when sending command\n", 
> ret); \
>  })
Could this be more informative if the SPI device was printed? It is
available. Maybe in 99% of the cases there is only one user anyway, so it
will not help?
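
For illustration only, a minimal userspace sketch of a macro that both names
the device in the message and still yields the error code. It relies on GNU
statement expressions (gcc/clang), whose value is that of the last
expression; a statement expression that ends in an `if` has type void, so
callers could not check its result. `struct fake_dbi`, `fake_send()` and
`demo_command()` are made-up stand-ins, not the DRM API; in the kernel the
print would presumably be something like dev_err_ratelimited() on the SPI
device.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mipi_dbi; it only carries a device name. */
struct fake_dbi {
	const char *dev_name;
	int fail_with;	/* error code fake_send() returns, 0 for success */
};

/* Hypothetical stand-in for mipi_dbi_command_stackbuf(). */
static int fake_send(struct fake_dbi *dbi, unsigned char cmd,
		     const unsigned char *data, size_t len)
{
	(void)cmd;
	(void)data;
	(void)len;
	return dbi->fail_with;
}

/*
 * GNU statement expression: logs on failure and evaluates to the error
 * code, because "ret_;" is the last expression in the block.
 */
#define demo_command(dbi, cmd, seq...) \
({ \
	const unsigned char d_[] = { seq }; \
	int ret_ = fake_send(dbi, cmd, d_, sizeof(d_)); \
	if (ret_) \
		fprintf(stderr, "%s: error %d when sending command 0x%02x\n", \
			(dbi)->dev_name, ret_, (unsigned int)(cmd)); \
	ret_; \
})
```

With this shape, existing callers of the form `ret = demo_command(...)` keep
working, and the message identifies which device failed.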

Sam


Re: [PATCH] drm/panel: panel-simple: Fix proper bpc for ytc700tlag_05_201c

2021-07-01 Thread Jagan Teki
Hi Sam and Thierry,

On Tue, May 25, 2021 at 12:12 AM Jagan Teki  wrote:
>
> ytc700tlag_05_201c panel support 8 bpc not 6 bpc as per
> recent testing in i.MX8MM platform.
>
> Fix it.
>
> Signed-off-by: Jagan Teki 
> ---
>  drivers/gpu/drm/panel/panel-simple.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panel/panel-simple.c 
> b/drivers/gpu/drm/panel/panel-simple.c
> index 9be050ab372f..6f4151729fb7 100644
> --- a/drivers/gpu/drm/panel/panel-simple.c
> +++ b/drivers/gpu/drm/panel/panel-simple.c
> @@ -4164,7 +4164,7 @@ static const struct drm_display_mode 
> yes_optoelectronics_ytc700tlag_05_201c_mode
>  static const struct panel_desc yes_optoelectronics_ytc700tlag_05_201c = {
> .modes = &yes_optoelectronics_ytc700tlag_05_201c_mode,
> .num_modes = 1,
> -   .bpc = 6,
> +   .bpc = 8,

Can you pick this up if all is okay?

Jagan.


Re: [PATCH] drm/panel: Fix up DT bindings for Samsung lms397kf04

2021-07-01 Thread Sam Ravnborg
Hi Linus,

On Thu, Jul 01, 2021 at 11:36:18PM +0200, Linus Walleij wrote:
> Improve the bindings and make them more usable:
> 
> - Pick in spi-cpha and spi-cpol from the SPI node parent,
>   this will specify that we are "type 3" in the device tree
>   rather than hardcoding it in the operating system.
> - Drop the u32 ref from the SPI frequency: comes in from
>   the SPI host bindings.
> - Make spi-cpha, spi-cpol and port compulsory.
> - Update the example with a real-world SPI controller,
>   spi-gpio.
> 
> Cc: Douglas Anderson 
> Cc: Noralf Trønnes 
> Cc: devicet...@vger.kernel.org
> Signed-off-by: Linus Walleij 
Looks good,
Reviewed-by: Sam Ravnborg 


Re: [PATCH 0/4] mgag200: Various cleanups

2021-07-01 Thread Thomas Zimmermann

Hi Sam

On 01.07.21 at 19:58, Sam Ravnborg wrote:

Hi Thomas,

On Thu, Jul 01, 2021 at 02:43:12PM +0200, Thomas Zimmermann wrote:

Cleanup several nits in the driver's init code. Also move constant
data into the RO data segment. No functional changes.

Tested on mgag200 HW.

Thomas Zimmermann (4):
   drm/mgag200: Don't pass flags to drm_dev_register()
   drm/mgag200: Inline mgag200_device_init()

This patch drops a redundant error message too - it would have helped me
if the changelog had said so, but whatever.


Sure, I'll add it to the log.




   drm/mgag200: Extract device type and flags in mgag200_pci_probe()
   drm/mgag200: Constify LUT for programming bpp


Full series is:
Acked-by: Sam Ravnborg 


Thanks for the review.

Best regards
Thomas



Sam



--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer





Re: [PATCH 2/2] drm/vc4: hdmi: Convert to gpiod

2021-07-01 Thread Nathan Chancellor
On Mon, May 24, 2021 at 03:18:52PM +0200, Maxime Ripard wrote:
> The new gpiod interface takes care of parsing the GPIO flags and to
> return the logical value when accessing an active-low GPIO, so switching
> to it simplifies a lot the driver.
> 
> Signed-off-by: Maxime Ripard 
> ---
>  drivers/gpu/drm/vc4/vc4_hdmi.c | 24 +++-
>  drivers/gpu/drm/vc4/vc4_hdmi.h |  3 +--
>  2 files changed, 8 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
> index ccc6c8079dc6..34622c59f6a7 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.c
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
> @@ -159,10 +159,9 @@ vc4_hdmi_connector_detect(struct drm_connector 
> *connector, bool force)
>   struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector);
>   bool connected = false;
>  
> - if (vc4_hdmi->hpd_gpio) {
> - if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^
> - vc4_hdmi->hpd_active_low)
> - connected = true;
> + if (vc4_hdmi->hpd_gpio &&
> + gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) {
> + connected = true;
>   } else if (drm_probe_ddc(vc4_hdmi->ddc)) {
>   connected = true;
>   } else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) {
> @@ -1993,7 +1992,6 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> device *master, void *data)
>   struct vc4_hdmi *vc4_hdmi;
>   struct drm_encoder *encoder;
>   struct device_node *ddc_node;
> - u32 value;
>   int ret;
>  
>   vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL);
> @@ -2031,18 +2029,10 @@ static int vc4_hdmi_bind(struct device *dev, struct 
> device *master, void *data)
>   /* Only use the GPIO HPD pin if present in the DT, otherwise
>* we'll use the HDMI core's register.
>*/
> - if (of_find_property(dev->of_node, "hpd-gpios", &value)) {
> - enum of_gpio_flags hpd_gpio_flags;
> -
> - vc4_hdmi->hpd_gpio = of_get_named_gpio_flags(dev->of_node,
> -  "hpd-gpios", 0,
> -  &hpd_gpio_flags);
> - if (vc4_hdmi->hpd_gpio < 0) {
> - ret = vc4_hdmi->hpd_gpio;
> - goto err_put_ddc;
> - }
> -
> - vc4_hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW;
> + vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN);
> + if (IS_ERR(vc4_hdmi->hpd_gpio)) {
> + ret = PTR_ERR(vc4_hdmi->hpd_gpio);
> + goto err_put_ddc;
>   }
>  
>   vc4_hdmi->disable_wifi_frequencies =
> diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h
> index 060bcaefbeb5..2688a55461d6 100644
> --- a/drivers/gpu/drm/vc4/vc4_hdmi.h
> +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h
> @@ -146,8 +146,7 @@ struct vc4_hdmi {
>   /* VC5 Only */
>   void __iomem *rm_regs;
>  
> - int hpd_gpio;
> - bool hpd_active_low;
> + struct gpio_desc *hpd_gpio;
>  
>   /*
>* On some systems (like the RPi4), some modes are in the same
> -- 
> 2.31.1

Hi Maxime,

This patch as commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod")
causes my Raspberry Pi 3 to lock up shortly after boot in combination
with commit 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to
runtime_pm"). The serial console and ssh are completely unresponsive and
I do not see any messages in dmesg with "debug ignore_loglevel". The
device is running with a 32-bit kernel (multi_v7_defconfig) with 32-bit
userspace. If there is any further information that I can provide,
please let me know.

Cheers,
Nathan


Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool

2021-07-01 Thread Guenter Roeck
Hi,

On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote:
> If a device is not behind an IOMMU, we look up the device node and set
> up the restricted DMA when the restricted-dma-pool is presented.
> 
> Signed-off-by: Claire Chang 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 

With this patch in place, all sparc and sparc64 qemu emulations
fail to boot. Symptom is that the root file system is not found.
Reverting this patch fixes the problem. Bisect log is attached.

Guenter

---
# bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific files 
for 20210701
# good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
git bisect start 'HEAD' 'v5.13'
# bad: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking branch 
'rdma/for-next'
git bisect bad f63c4fda987a19b1194cc45cb72fd5bf968d9d90
# good: [46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4] Merge remote-tracking branch 
'ipsec/master'
git bisect good 46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4
# good: [43ba6969cfb8185353a7a6fc79070f13b9e3d6d3] Merge remote-tracking branch 
'clk/clk-next'
git bisect good 43ba6969cfb8185353a7a6fc79070f13b9e3d6d3
# good: [1ca5eddcf8dca1d6345471c6404e7364af0d7019] Merge remote-tracking branch 
'fuse/for-next'
git bisect good 1ca5eddcf8dca1d6345471c6404e7364af0d7019
# good: [8f6d7b3248705920187263a4e7147b0752ec7dcf] Merge remote-tracking branch 
'pci/next'
git bisect good 8f6d7b3248705920187263a4e7147b0752ec7dcf
# good: [df1885a755784da3ef285f36d9230c1d090ef186] RDMA/rtrs_clt: Alloc less 
memory with write path fast memory registration
git bisect good df1885a755784da3ef285f36d9230c1d090ef186
# good: [93d31efb58c8ad4a66bbedbc2d082df458c04e45] Merge remote-tracking branch 
'cpufreq-arm/cpufreq/arm/linux-next'
git bisect good 93d31efb58c8ad4a66bbedbc2d082df458c04e45
# good: [46308965ae6fdc7c25deb2e8c048510ae51bbe66] RDMA/irdma: Check contents 
of user-space irdma_mem_reg_req object
git bisect good 46308965ae6fdc7c25deb2e8c048510ae51bbe66
# good: [6de7a1d006ea9db235492b288312838d6878385f] 
thermal/drivers/int340x/processor_thermal: Split enumeration and processing part
git bisect good 6de7a1d006ea9db235492b288312838d6878385f
# good: [081bec2577cda3d04f6559c60b6f4e2242853520] dt-bindings: of: Add 
restricted DMA pool
git bisect good 081bec2577cda3d04f6559c60b6f4e2242853520
# good: [bf95ac0bcd69979af146852f6a617a60285ebbc1] Merge remote-tracking branch 
'thermal/thermal/linux-next'
git bisect good bf95ac0bcd69979af146852f6a617a60285ebbc1
# good: [3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc] RDMA/core: Always release 
restrack object
git bisect good 3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc
# bad: [cff1f23fad6e0bd7d671acce0d15285c709f259c] Merge remote-tracking branch 
'swiotlb/linux-next'
git bisect bad cff1f23fad6e0bd7d671acce0d15285c709f259c
# bad: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing for 
restricted DMA pool
git bisect bad b655006619b7bccd0dc1e055bd72de5d613e7b5c
# first bad commit: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing 
for restricted DMA pool


Re: [PATCH v2] drm/panfrost:report the full raw fault information instead

2021-07-01 Thread Chunyou Tang
Hi Steve,
> You didn't answer my previous question:
> 
> > Is this device working with the kbase/DDK proprietary driver?

I don't know whether I used kbase/DDK, I only know I used the panfrost
driver in Linux 5.11.

> What you are describing sounds like a hardware integration issue, so
> it would be good to check that the hardware is working with the
> proprietary driver to rule that out. And perhaps there is something
> in the kbase for this device that is setting a chicken bit to 'fix'
> the coherency?

I don't have the proprietary driver, I only used the driver in Linux 5.11.

Thanks very much!

Chunyou.


On Thu, 1 Jul 2021 11:15:14 +0100
Steven Price wrote:

> On 29/06/2021 04:04, Chunyou Tang wrote:


> > Hi Steve,
> > thanks for your reply.
> > I set the pte in arm_lpae_prot_to_pte(),
> > ***
> > /*
> >  * Also Mali has its own notions of shareability wherein its Inner
> >  * domain covers the cores within the GPU, and its Outer domain is
> >  * "outside the GPU" (i.e. either the Inner or System domain in CPU
> >  * terms, depending on coherency).
> >  */
> > if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
> > pte |= ARM_LPAE_PTE_SH_IS;
> > else
> > pte |= ARM_LPAE_PTE_SH_OS;
> > ***
> > I set pte |= ARM_LPAE_PTE_SH_NS.
> > 
> > If I set pte to ARM_LPAE_PTE_SH_OS or ARM_LPAE_PTE_SH_IS, a GPU
> > fault occurs whether I use a single-core or a multi-core GPU.
> > If I set pte to ARM_LPAE_PTE_SH_NS, no GPU fault occurs with
> > either a single-core or a multi-core GPU.
> 
> Hi,
> 
> So this is a difference between Panfrost and kbase. Panfrost (well
> technically the IOMMU framework) enables the inner-shareable bit for
> all memory, whereas kbase only enables it for some memory types (the
> BASE_MEM_COHERENT_LOCAL flag in the UABI controls it). However this
> should only be a performance/power difference (and AFAIK probably an
> irrelevant one) and it's definitely required that "inner shareable"
> (i.e. within the GPU) works for communication between the different
> units of the GPU.
> 
> You didn't answer my previous question:
> 
> > Is this device working with the kbase/DDK proprietary driver?
> 
> What you are describing sounds like a hardware integration issue, so
> it would be good to check that the hardware is working with the
> proprietary driver to rule that out. And perhaps there is something
> in the kbase for this device that is setting a chicken bit to 'fix'
> the coherency?
> 
> Steve




Re: [PATCH 06/53] drm/i915/selftests: Allow for larger engine counts

2021-07-01 Thread Lucas De Marchi

On Thu, Jul 01, 2021 at 01:23:40PM -0700, Matt Roper wrote:

From: John Harrison 

Increasing the engine count causes a couple of local array variables
to exceed the kernel stack limit. So make them dynamic allocations
instead.

Signed-off-by: John Harrison 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matt Roper 
---
drivers/gpu/drm/i915/gt/selftest_execlists.c  | 10 --
.../gpu/drm/i915/gt/selftest_workarounds.c| 32 ---
2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c 
b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 08896ae027d5..1e7fe479 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, 
unsigned int flags)
#define BATCH BIT(0)
{
struct task_struct *tsk[I915_NUM_ENGINES] = {};
-   struct preempt_smoke arg[I915_NUM_ENGINES];
+   struct preempt_smoke *arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
unsigned long count;
int err = 0;

+   arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL);
+   if (!arg)
+   return -ENOMEM;
+
for_each_engine(engine, smoke->gt, id) {
arg[id] = *smoke;
arg[id].engine = engine;
@@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, 
unsigned int flags)
arg[id].batch = NULL;
arg[id].count = 0;

-   tsk[id] = kthread_run(smoke_crescendo_thread, &arg,
+   tsk[id] = kthread_run(smoke_crescendo_thread, arg,
  "igt/smoke:%d", id);
if (IS_ERR(tsk[id])) {
err = PTR_ERR(tsk[id]);
@@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, 
unsigned int flags)

pr_info("Submitted %lu crescendo:%x requests across %d engines and %d 
contexts\n",
count, flags, smoke->gt->info.num_engines, smoke->ncontext);
+
+   kfree(arg);
return 0;



this looks correct, but apparently this test doesn't test anything, as
`err` is write-only - there is only one read, and that is basically to
avoid overriding an earlier error.

Looks like this should be `return err;`?
+Chris
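
A reduced userspace sketch of the pattern in question, for illustration
only. The status array and all function names are made up and this is not
the selftest code: the first failure is remembered in `err`, but only the
second variant lets the caller see it.

```c
#include <stddef.h>

/* Made-up stand-in for collecting the per-engine thread results. */
static int collect_first_error(const int *status, size_t n)
{
	int err = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Keep the first failure; later ones must not overwrite it. */
		if (status[i] < 0 && !err)
			err = status[i];
	}
	return err;
}

/* Write-only err: the value is computed and then dropped. */
static int run_ignoring_err(const int *status, size_t n)
{
	int err = collect_first_error(status, n);

	(void)err;
	return 0;
}

/* Propagating variant: the first failure reaches the caller. */
static int run_returning_err(const int *status, size_t n)
{
	return collect_first_error(status, n);
}
```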

This patch itself looks good.

Reviewed-by: Lucas De Marchi 

Lucas De Marchi


[PATCH] drm/dbi: Print errors for mipi_dbi_command()

2021-07-01 Thread Linus Walleij
The macro mipi_dbi_command() does not report errors unless you wrap it
in another macro to do the error reporting.

Report a rate-limited error so we know what is going on.

Drop the only user in DRM using mipi_dbi_command() and actually checking
the error explicitly, let it use mipi_dbi_command_buf() directly
instead.

After this any code wishing to send command arrays can rely on
mipi_dbi_command() providing an appropriate error message if something
goes wrong.

Suggested-by: Noralf Trønnes 
Suggested-by: Douglas Anderson 
Signed-off-by: Linus Walleij 
---
 drivers/gpu/drm/drm_mipi_dbi.c | 2 +-
 include/drm/drm_mipi_dbi.h | 5 -
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
index 3854fb9798e9..c7c1b75df190 100644
--- a/drivers/gpu/drm/drm_mipi_dbi.c
+++ b/drivers/gpu/drm/drm_mipi_dbi.c
@@ -645,7 +645,7 @@ static int mipi_dbi_poweron_reset_conditional(struct 
mipi_dbi_dev *dbidev, bool
return 1;
 
mipi_dbi_hw_reset(dbi);
-   ret = mipi_dbi_command(dbi, MIPI_DCS_SOFT_RESET);
+   ret = mipi_dbi_command_buf(dbi, MIPI_DCS_SOFT_RESET, NULL, 0);
if (ret) {
DRM_DEV_ERROR(dev, "Failed to send reset command (%d)\n", ret);
if (dbidev->regulator)
diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h
index f543d6e3e822..2057ad32760c 100644
--- a/include/drm/drm_mipi_dbi.h
+++ b/include/drm/drm_mipi_dbi.h
@@ -183,7 +183,10 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer 
*fb,
 #define mipi_dbi_command(dbi, cmd, seq...) \
 ({ \
const u8 d[] = { seq }; \
-   mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
+   int ret; \
+   ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
+   if (ret) \
+   pr_err_ratelimited("MIPI DBI: error %d when sending command\n", 
ret); \
 })
 
 #ifdef CONFIG_DEBUG_FS
-- 
2.31.1



Re: [Intel-gfx] [PATCH 05/53] drm/i915/gen12: Use fuse info to enable SFC

2021-07-01 Thread Lucas De Marchi

On Thu, Jul 01, 2021 at 01:23:39PM -0700, Matt Roper wrote:

From: Venkata Sandeep Dhanalakota 

In Gen12 there are various fuse combinations and in each configuration
vdbox engine may be connected to SFC depending on which engines are
available, so we need to set the SFC capability based on fuse value from
the hardware. Even numbered phyical instance always have SFC, odd
numbered physical instances have SFC only if previous even instance is
fused off.

Bspec: 48028


Considering that in TGL we have physical instances 0 and 2 (both even),
we can use this logic, so it's correct for GRAPHICS_VER(i915) == 12.
Although I wonder if we should be using MEDIA_VER(i915) here.
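
A quick standalone check of the Gen12 rule from the commit message, as a
plain userspace sketch. The helper names are invented and are not the
driver's; the mask is assumed to have one bit set per enabled physical
VDBOX. With instances 0 and 2 enabled (mask 0x5), both end up with SFC
access.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Gen12 rule: even physical instances always have an SFC; an odd instance
 * has one only if the previous even instance is fused off.
 */
static bool gen12_vdbox_has_sfc(unsigned int physical_vdbox, uint16_t vdbox_mask)
{
	if (physical_vdbox % 2 == 0)
		return true;
	return !(vdbox_mask & (1u << (physical_vdbox - 1)));
}

/* Build the SFC access mask for every enabled VDBOX in vdbox_mask. */
static uint16_t gen12_sfc_access(uint16_t vdbox_mask)
{
	uint16_t access = 0;
	unsigned int i;

	for (i = 0; i < 16; i++) {
		if (!(vdbox_mask & (1u << i)))
			continue;	/* instance is fused off */
		if (gen12_vdbox_has_sfc(i, vdbox_mask))
			access |= (uint16_t)(1u << i);
	}
	return access;
}
```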



Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Venkata Sandeep Dhanalakota 
Signed-off-by: Matt Roper 



Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++-
1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 151870d8fdd3..4ab2c9abb943 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt)
}
}

+static inline
+bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox,
+  unsigned int logical_vdbox, u16 vdbox_mask)
+{
+   /*
+* In Gen11, only even numbered logical VDBOXes are hooked
+* up to an SFC (Scaler & Format Converter) unit.
+* In Gen12, Even numbered phyical instance always are connected
+* to an SFC. Odd numbered physical instances have SFC only if
+* previous even instance is fused off.
+*/
+   if (GRAPHICS_VER(i915) == 12) {
+   return (physical_vdbox % 2 == 0) ||
+   !(BIT(physical_vdbox - 1) & vdbox_mask);
+   } else if (GRAPHICS_VER(i915) == 11) {
+   return logical_vdbox % 2 == 0;
+   }
+
+   MISSING_CASE(GRAPHICS_VER(i915));
+   return false;
+}
+
/*
 * Determine which engines are fused off in our particular hardware.
 * Note that we have a catch-22 situation where we need to be able to access
@@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
continue;
}

-   /*
-* In Gen11, only even numbered logical VDBOXes are
-* hooked up to an SFC (Scaler & Format Converter) unit.
-* In TGL each VDBOX has access to an SFC.
-*/
-   if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
+   if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
gt->info.vdbox_sfc_access |= BIT(i);
+   logical_vdbox++;
}
drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n",
vdbox_mask, VDBOX_MASK(gt));
--
2.25.4



Re: [PATCH 04/53] drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based

2021-07-01 Thread Lucas De Marchi

On Thu, Jul 01, 2021 at 01:23:38PM -0700, Matt Roper wrote:

From: Tvrtko Ursulin 

On Xe_HP the fusing register is renamed and changed to have the "enable"
semantics, but otherwise remains compatible (mmio address, bitmask
ranges) with older platforms.

To simplify things we do not add a new register definition but just stop
inverting the fusing masks before processing them.

Bspec: 33288


This is now:

Bspec: 52615
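
For illustration, the normalization the commit message describes can be
reduced to the following userspace sketch. The function name and the
boolean parameter are assumptions made for the example; the patch itself
keys the inversion off GRAPHICS_VER_FULL(i915) < IP_VER(12, 50).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return a mask in which a set bit always means "engine present".
 * enable_semantics is true for the newer "enable" style register and
 * false for the older "disable" style register, whose bits are inverted.
 */
static uint32_t normalize_media_fuse(uint32_t raw, bool enable_semantics)
{
	return enable_semantics ? raw : ~raw;
}
```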


Cc: Daniele Ceraolo Spurio 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Matt Roper 



With this change above,

Reviewed-by: Lucas De Marchi 

Lucas De Marchi


---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 -
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 88694822716a..151870d8fdd3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -468,7 +468,14 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
if (GRAPHICS_VER(i915) < 11)
return info->engine_mask;

-   media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
+   /*
+* On newer platforms the fusing register is called 'enable' and has
+* enable semantics, while on older platforms it is called 'disable'
+* and bits have disable semantices.
+*/
+   media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
+   if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
+   media_fuse = ~media_fuse;

vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
--
2.25.4



Re: [PATCH 16/53] drm/i915/xehpsdv: add initial XeHP SDV definitions

2021-07-01 Thread Rodrigo Vivi
On Thu, Jul 01, 2021 at 01:23:50PM -0700, Matt Roper wrote:
> From: Lucas De Marchi 
> 
> XeHP SDV is a Intel® dGPU without display. This is just the definition
> of some basic platform macros, by large a copy of current state of
> Tigerlake which does not reflect the end state of this platform.
> 
> Bspec: 44467, 48077
> Cc: Rodrigo Vivi 
> Signed-off-by: Lucas De Marchi 
> Signed-off-by: Daniele Ceraolo Spurio 
> Signed-off-by: José Roberto de Souza 
> Signed-off-by: Stuart Summers 
> Signed-off-by: Tomas Winkler 
> Signed-off-by: Matt Roper 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/i915_drv.h  | 10 ++
>  drivers/gpu/drm/i915/i915_pci.c  | 20 
>  drivers/gpu/drm/i915/intel_device_info.c |  1 +
>  drivers/gpu/drm/i915/intel_device_info.h |  1 +
>  4 files changed, 32 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c02600850246..63bed18a2be7 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>  #define IS_DG1(dev_priv)        IS_PLATFORM(dev_priv, INTEL_DG1)
>  #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S)
>  #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P)
> +#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV)
>  #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \
>   (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00)
>  #define IS_BDW_ULT(dev_priv) \
> @@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>   (IS_ALDERLAKE_P(__i915) && \
>IS_GT_STEP(__i915, since, until))
>  
> +#define XEHPSDV_REVID_A0 0x0
> +#define XEHPSDV_REVID_A1 0x1
> +#define XEHPSDV_REVID_A_LAST XEHPSDV_REVID_A1
> +#define XEHPSDV_REVID_B0 0x4
> +#define XEHPSDV_REVID_C0 0x8
> +
> +#define IS_XEHPSDV_REVID(p, since, until) \
> + (IS_XEHPSDV(p) && IS_REVID(p, since, until))
> +
>  #define IS_LP(dev_priv)  (INTEL_INFO(dev_priv)->is_lp)
>  #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv))
>  #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && 
> !IS_LP(dev_priv))
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index 88b279452b87..046309e95f43 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = {
>   .ppgtt_size = 48, \
>   .ppgtt_type = INTEL_PPGTT_FULL
>  
> +#define XE_HPM_FEATURES \
> + .media_ver = 12, \
> + .media_ver_release = 50
> +
> +__maybe_unused
> +static const struct intel_device_info xehpsdv_info = {
> + XE_HP_FEATURES,
> + XE_HPM_FEATURES,
> + DGFX_FEATURES,
> + PLATFORM(INTEL_XEHPSDV),
> + .display = { },
> + .pipe_mask = 0,
> + .platform_engine_mask =
> + BIT(RCS0) | BIT(BCS0) |
> + BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) |
> + BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) |
> + BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7),
> + .require_force_probe = 1,
> +};
> +
>  #undef PLATFORM
>  
>  /*
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
> b/drivers/gpu/drm/i915/intel_device_info.c
> index e8ad14f002c1..7b37b68f4548 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -68,6 +68,7 @@ static const char * const platform_names[] = {
>   PLATFORM_NAME(DG1),
>   PLATFORM_NAME(ALDERLAKE_S),
>   PLATFORM_NAME(ALDERLAKE_P),
> + PLATFORM_NAME(XEHPSDV),
>  };
>  #undef PLATFORM_NAME
>  
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
> b/drivers/gpu/drm/i915/intel_device_info.h
> index f824de632cfe..e8684199b0c9 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -88,6 +88,7 @@ enum intel_platform {
>   INTEL_DG1,
>   INTEL_ALDERLAKE_S,
>   INTEL_ALDERLAKE_P,
> + INTEL_XEHPSDV,
>   INTEL_MAX_PLATFORMS
>  };
>  
> -- 
> 2.25.4
> 


Re: [PATCH 23/53] drm/i915/xehpsdv: Read correct RP_STATE_CAP register

2021-07-01 Thread Rodrigo Vivi
On Thu, Jul 01, 2021 at 01:23:57PM -0700, Matt Roper wrote:
> The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this
> register is now a per-tile register at GTTMMADDR offset 0x250014.
> 
> Cc: Rodrigo Vivi 
> Signed-off-by: Matt Roper 
> Signed-off-by: Lucas De Marchi 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++-
>  drivers/gpu/drm/i915/i915_reg.h | 1 +
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
> b/drivers/gpu/drm/i915/gt/intel_rps.c
> index 490bc1513480..8e7b70248392 100644
> --- a/drivers/gpu/drm/i915/gt/intel_rps.c
> +++ b/drivers/gpu/drm/i915/gt/intel_rps.c
> @@ -1937,7 +1937,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps)
>   struct drm_i915_private *i915 = rps_to_i915(rps);
>   struct intel_uncore *uncore = rps_to_uncore(rps);
>  
> - if (IS_GEN9_LP(i915))
> + if (IS_XEHPSDV(i915))
> + return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
> + else if (IS_GEN9_LP(i915))
>   return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
>   else
>   return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 0231f42226db..2992e8585399 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4110,6 +4110,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>  #define GEN6_RP_STATE_CAP _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998)
>  #define BXT_RP_STATE_CAP _MMIO(0x138170)
>  #define GEN9_RP_STATE_LIMITS _MMIO(0x138148)
> +#define XEHPSDV_RP_STATE_CAP _MMIO(0x250014)
>  
>  /*
>   * Logical Context regs
> -- 
> 2.25.4
> 


[PATCH] drm/panel: Fix up DT bindings for Samsung lms397kf04

2021-07-01 Thread Linus Walleij
Improve the bindings and make them more usable:

- Pick in spi-cpha and spi-cpol from the SPI node parent,
  this will specify that we are "type 3" in the device tree
  rather than hardcoding it in the operating system.
- Drop the u32 ref from the SPI frequency: comes in from
  the SPI host bindings.
- Make spi-cpha, spi-cpol and port compulsory.
- Update the example with a real-world SPI controller,
  spi-gpio.

Cc: Douglas Anderson 
Cc: Noralf Trønnes 
Cc: devicet...@vger.kernel.org
Signed-off-by: Linus Walleij 
---
 .../display/panel/samsung,lms397kf04.yaml  | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml 
b/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml
index 4cb75a5f2e3a..cd62968426fb 100644
--- a/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml
+++ b/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml
@@ -33,8 +33,11 @@ properties:
 
   backlight: true
 
+  spi-cpha: true
+
+  spi-cpol: true
+
   spi-max-frequency:
-$ref: /schemas/types.yaml#/definitions/uint32
 description: inherited as a SPI client node, the datasheet specifies
   maximum 300 ns minimum cycle which gives around 3 MHz max frequency
 maximum: 300
@@ -44,6 +47,9 @@ properties:
 required:
   - compatible
   - reg
+  - spi-cpha
+  - spi-cpol
+  - port
 
 additionalProperties: false
 
@@ -52,15 +58,23 @@ examples:
 #include 
 
 spi {
+  compatible = "spi-gpio";
+  sck-gpios = < 0 GPIO_ACTIVE_HIGH>;
+  miso-gpios = < 1 GPIO_ACTIVE_HIGH>;
+  mosi-gpios = < 2 GPIO_ACTIVE_HIGH>;
+  cs-gpios = < 3 GPIO_ACTIVE_HIGH>;
+  num-chipselects = <1>;
   #address-cells = <1>;
   #size-cells = <0>;
   panel@0 {
 compatible = "samsung,lms397kf04";
 spi-max-frequency = <300>;
+spi-cpha;
+spi-cpol;
 reg = <0>;
 vci-supply = <_3v0_reg>;
 vccio-supply = <_1v8_reg>;
-reset-gpios = < 1 GPIO_ACTIVE_LOW>;
+reset-gpios = < 4 GPIO_ACTIVE_LOW>;
 backlight = <>;
 
 port {
-- 
2.31.1



Re: [PATCH v2] drm/meson: fix potential NULL pointer exception in meson_drv_unbind()

2021-07-01 Thread Martin Blumenstingl
Hello,

first of all: thanks for your patch and sorry for being late with my
review question.

On Fri, Jun 18, 2021 at 7:28 AM Jiajun Cao  wrote:
>
> Fix a potential NULL pointer exception when meson_drv_unbind()
> attempts to operate on the driver_data priv which may be NULL.
> Add a null pointer check on the priv struct to avoid the NULL
> pointer dereference after calling dev_get_drvdata(), just like
> the null pointer checks done on the struct priv in the function
> meson_drv_shutdown(), meson_drv_pm_suspend() and meson_drv_pm_resume().
I am trying to review Amlogic Meson related patches in the DRM
subsystem so I can help Neil with this.
However, I am still new to this, so please help educate me on this topic.

[...]
>  static void meson_drv_unbind(struct device *dev)
>  {
> struct meson_drm *priv = dev_get_drvdata(dev);
> -   struct drm_device *drm = priv->drm;
> +   struct drm_device *drm;
> +
> +   if (!priv)
> +   return;
My understanding of the component framework is that meson_drv_unbind()
is only called if previously meson_drv_bind() was called (and did not
return any error).
This is different from meson_drv_shutdown() (for example) because that
can be called if meson_drv_probe() returns 0 (success) in case the
"count" variable was 0 (then the probe function does nothing).

As I mentioned before: I am still learning about the DRM subsystem in
the Linux kernel.
So it would be great if you could help me understand for which
scenarios this newly added if-condition is needed.


Thank you!
Best regards,
Martin


Re: [git pull] drm for 5.14-rc1

2021-07-01 Thread Felix Kuehling
On 2021-07-01 at 4:15 p.m., Linus Torvalds wrote:
> On Wed, Jun 30, 2021 at 9:34 PM Dave Airlie  wrote:
>> Hi Linus,
>>
>> This is the main drm pull request for 5.14-rc1.
>>
>> I've done a test pull into your current tree, and hit two conflicts
>> (one in vc4, one in amdgpu), both seem pretty trivial, the amdgpu one
>> is recent and sfr sent out a resolution for it today.
> Well, the resolutions may be trivial, but the conflict made me look at
> the code, and it's buggy.
>
> Commit 04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function")
> is broken. It made the code do
>
> mmap_read_lock(mm);
> vma = find_vma(mm, start);
> mmap_read_unlock(mm);
>
> and then it *uses* that "vma" after it has dropped the lock.
>
> That's a big no-no - once you've dropped the lock, the vma contents
> simply aren't reliable any more. That mapping could now be unmapped
> and removed at any time.
>
> Now, the conflict actually made one of the uses go away (switching to
> vma_lookup() means that the subsequent code no longer needs to look at
> "vm_start" to verify we're actually _inside_ the vma), but it still
> checks for vma->vm_file afterwards.
>
> So those locking changes in commit 04d8d73dbcbe are completely bogus.
>
> I tried to fix up that bug while handling the conflict, but who knows
> what else similar is going on elsewhere.
>
> So I would ask people to
>
>  (a) verify that I didn't make things worse as I fixed things up (note
> how I had to change the last argument to amdgpu_hmm_range_get_pages()
> from false to true etc).
>
>  (b) go and look at their vma lookup code: you can't just look up a
> vma under the lock, and then drop the lock, and then think things stay
> stable.
>
> In particular for that (b) case: it is *NOT* enough to look up
> vma->vm_file inside the lock and cache that. No - if the test is about
> "no backing file before looking up pages", then you have to *keep*
> holding the lock until after you've actually looked up the pages!
>
> Because otherwise any test for "vma->vm_file" is entirely pointless,
> for the same reason it's buggy to even look at it after dropping the
> lock: because once you've dropped the lock, the thing you just tested
> for might not be true any more.
>
> So no, it's not valid to do
>
> bool has_file = vma && vma->vm_file;
>
> and then drop the lock, because you don't use 'vma' any more as a
> pointer, and then use 'has_file' outside the lock. Because after
> you've dropped the lock, 'has_file' is now meaningless.
>
> So it's not just about "you can't look at vma->vm_file after dropping
> the lock". It's more fundamental than that. Any *decision* you make
> based on the vma is entirely pointless and moot after the lock is
> dropped!
>
> Did I fix it up correctly? Who knows. The code makes more sense to me
> now and seems valid. But I really *really* want to stress how locking
> is important.

Thank you for the fix and the explanation. Your fix looks correct. I
also double-checked all other uses of find_vma in the amdgpu driver.
They all hold the mmap lock correctly.

Two comments:

With this fix, we could remove the bool mmap_locked parameter from
amdgpu_hmm_range_get_pages because it always gets called with the lock
held now.

You're now holding the mmap lock from the vma_lookup until
hmm_range_fault is done. This ensures that the result of the
vma->vm_file check remains valid. This was broken even before our commit
04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function").
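
As a rough illustration of that rule (this is not the amdgpu code), the
pattern can be modelled in plain userspace C. Everything below is a
hypothetical stand-in: struct region plays the role of the vma, struct
space plays the role of the mm, and the "lock" is only a counter so that
the invariant can be asserted in a single-threaded sketch. The point it
tries to show is that the lookup, the "no backing file" test and the page
lookup all sit inside one critical section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: only the field the check looks at. */
struct region {
	const void *file;	/* non-NULL means file-backed */
};

/* Hypothetical stand-in for the mm. */
struct space {
	int readers;		/* models the mmap read lock: > 0 means held */
	struct region *region;	/* result of the "vma lookup", may be NULL */
};

static void space_read_lock(struct space *s)
{
	s->readers++;
}

static void space_read_unlock(struct space *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* Stand-in for the page lookup; like lockdep, insist the lock is held. */
static int fault_pages_locked(const struct space *s, const struct region *r)
{
	assert(s->readers > 0);
	assert(r && !r->file);
	return 0;
}

/*
 * Look up, test and use the region inside one read-side critical section.
 * Returns 0 on success, -1 if nothing is mapped, -2 if it is file-backed.
 */
static int get_pages(struct space *s)
{
	int ret;

	space_read_lock(s);
	if (!s->region)
		ret = -1;
	else if (s->region->file)
		ret = -2;
	else
		ret = fault_pages_locked(s, s->region);
	space_read_unlock(s);

	return ret;
}
```

Caching the result of the file check and calling fault_pages_locked()
after space_read_unlock() would trip the first assertion in this model,
which is the same mistake described above.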


>
> You also can't just unlock in the middle of an operation - even if you
> then take the lock *again* later (as amdgpu_hmm_range_get_pages() then
> did), the fact that you unlocked in the middle means that all the
> earlier tests you did are simply no longer valid when you re-take the
> lock.

I agree completely. I catch a lot of locking bugs in code review. I
probably missed this one because I wasn't paying enough attention to
what was being protected by the mmap_read_lock in this case.

Regards,
  Felix


>
>  Linus


Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors

2021-07-01 Thread abhinavk

On 2021-06-09 14:17, Dmitry Baryshkov wrote:

Move setting up encoders from set_encoder_mode to
_dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This
allows us to support not only "single DSI" and "dual DSI" but also "two
independent DSI" configurations. In future this would also help adding
support for multiple DP connectors.

Signed-off-by: Dmitry Baryshkov 
I will have to look at Bjorn's changes to see why they depend on this
cleanup.

Is the plan to call _dpu_kms_initialize_displayport() twice?
Even so, I cannot work out how that series depends on this one.
Can you please elaborate on that a little bit?


---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 -
 1 file changed, 44 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 1d3a4f395e74..b63e1c948ff2 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct 
drm_device *dev,

struct dpu_kms *dpu_kms)
 {
struct drm_encoder *encoder = NULL;
+   struct msm_display_info info;
int i, rc = 0;

if (!(priv->dsi[0] || priv->dsi[1]))
return rc;

-   /*TODO: Support two independent DSI connectors */
-   encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
-   if (IS_ERR(encoder)) {
-   DPU_ERROR("encoder init failed for dsi display\n");
-   return PTR_ERR(encoder);
-   }
-
-   priv->encoders[priv->num_encoders++] = encoder;
-
for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) {
if (!priv->dsi[i])
continue;

+   if (!encoder) {
+   encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI);
+   if (IS_ERR(encoder)) {
+   DPU_ERROR("encoder init failed for dsi display\n");
+   return PTR_ERR(encoder);
+   }
+
+   priv->encoders[priv->num_encoders++] = encoder;
+
+   memset(&info, 0, sizeof(info));
+   info.intf_type = encoder->encoder_type;
+   info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ?
+   MSM_DISPLAY_CAP_CMD_MODE :
+   MSM_DISPLAY_CAP_VID_MODE;
+   }
+
rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder);
if (rc) {
DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n",
i, rc);
break;
}
+
+   info.h_tile_instance[info.num_of_h_tiles++] = i;
+
+   if (!msm_dsi_is_dual_dsi(priv->dsi[i])) {


I would like to clarify the terminology of dual_dsi in the current DSI
driver before the rest of the reviews.
Today IS_DUAL_DSI() means that two DSIs are driving the same display, and
the two DSIs are operating in master-slave mode and are being driven by
the same PLL.
Usually, dual independent DSI means two DSIs driving two separate panels
using two separate PLLs (DSI0 with PLL0 and DSI1 with PLL1).
I assume that is happening due to the following logic, with both DSI PHYs
operating in STANDALONE mode:


if (!IS_DUAL_DSI()) {
ret = msm_dsi_host_register(msm_dsi->host, true);
if (ret)
return ret;

msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE);
ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy);


+   rc = dpu_encoder_setup(dev, encoder, &info);
+   if (rc)
+   DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+   encoder->base.id, rc);
+   encoder = NULL;
+   }
+   }
+
+   if (encoder) {


We will hit this case only for split-DSI, right? (That is, two DSIs
driving the same panel.)
Even a single DSI will be handled in the above loop now, so this looks a
bit confusing at the moment.


I think we need to be clearer about dual-DSI vs. split-DSI to avoid
confusion in the code about which one means what and which one we are
currently using. So what about having IS_DUAL_DSI() and IS_SPLIT_DSI() to
distinguish the terminologies, and changing the DSI driver accordingly?
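
To make the proposed distinction concrete, here is a minimal sketch in
plain C. It is only an illustration of the terminology, not a patch
against the msm DSI driver: struct dsi_topology, is_split_dsi(),
is_dual_dsi() and num_dsi_encoders() are all hypothetical names, and the
real driver would derive this information from its own state.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical description of a DSI setup. These names are illustrative
 * only and are not the msm driver's types.
 */
struct dsi_topology {
	int num_hosts;		/* DSI controllers in use: 1 or 2 */
	bool shared_panel;	/* true if both hosts drive one panel */
};

/* Split DSI: two hosts in master-slave mode on one panel, one PLL. */
static bool is_split_dsi(const struct dsi_topology *t)
{
	return t->num_hosts == 2 && t->shared_panel;
}

/* Dual DSI: two hosts driving two separate panels, each with its own PLL. */
static bool is_dual_dsi(const struct dsi_topology *t)
{
	return t->num_hosts == 2 && !t->shared_panel;
}

/* One encoder per panel: split DSI shares one, otherwise one per host. */
static int num_dsi_encoders(const struct dsi_topology *t)
{
	assert(t->num_hosts == 1 || t->num_hosts == 2);
	return is_split_dsi(t) ? 1 : t->num_hosts;
}
```

With names like these, the trailing "if (encoder)" block would read as
the split-DSI case, and the in-loop setup as the single and dual cases.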


+   rc = dpu_encoder_setup(dev, encoder, &info);
+   if (rc)
+   DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n",
+   encoder->base.id, rc);
}

return rc;
@@ -505,6 +530,7 @@ static int _dpu_kms_initialize_displayport(struct
drm_device *dev,
struct dpu_kms *dpu_kms)
 {
struct drm_encoder *encoder = NULL;
+

Re: [PATCH] drm/msm/dsi: drop gdsc regulator handling

2021-07-01 Thread Bjorn Andersson
On Wed 30 Jun 19:00 CDT 2021, Dmitry Baryshkov wrote:

> None of supported devices uses "gdsc" regulator for DSI. GDSC support is
> now implemented as a power domain. Drop old code and config handling
> gdsc regulator requesting and enabling.
> 
> Signed-off-by: Dmitry Baryshkov 

Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/gpu/drm/msm/dsi/dsi_cfg.c  | 12 
>  drivers/gpu/drm/msm/dsi/dsi_host.c | 22 +++---
>  2 files changed, 7 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_cfg.c 
> b/drivers/gpu/drm/msm/dsi/dsi_cfg.c
> index f3f1c03c7db9..32c37d7c2109 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_cfg.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_cfg.c
> @@ -32,9 +32,8 @@ static const char * const dsi_6g_bus_clk_names[] = {
>  static const struct msm_dsi_config msm8974_apq8084_dsi_cfg = {
>   .io_offset = DSI_6G_REG_SHIFT,
>   .reg_cfg = {
> - .num = 4,
> + .num = 3,
>   .regs = {
> - {"gdsc", -1, -1},
>   {"vdd", 15, 100},   /* 3.0 V */
>   {"vdda", 10, 100},  /* 1.2 V */
>   {"vddio", 10, 100}, /* 1.8 V */
> @@ -53,9 +52,8 @@ static const char * const dsi_8916_bus_clk_names[] = {
>  static const struct msm_dsi_config msm8916_dsi_cfg = {
>   .io_offset = DSI_6G_REG_SHIFT,
>   .reg_cfg = {
> - .num = 3,
> + .num = 2,
>   .regs = {
> - {"gdsc", -1, -1},
>   {"vdda", 10, 100},  /* 1.2 V */
>   {"vddio", 10, 100}, /* 1.8 V */
>   },
> @@ -73,9 +71,8 @@ static const char * const dsi_8976_bus_clk_names[] = {
>  static const struct msm_dsi_config msm8976_dsi_cfg = {
>   .io_offset = DSI_6G_REG_SHIFT,
>   .reg_cfg = {
> - .num = 3,
> + .num = 2,
>   .regs = {
> - {"gdsc", -1, -1},
>   {"vdda", 10, 100},  /* 1.2 V */
>   {"vddio", 10, 100}, /* 1.8 V */
>   },
> @@ -89,9 +86,8 @@ static const struct msm_dsi_config msm8976_dsi_cfg = {
>  static const struct msm_dsi_config msm8994_dsi_cfg = {
>   .io_offset = DSI_6G_REG_SHIFT,
>   .reg_cfg = {
> - .num = 7,
> + .num = 6,
>   .regs = {
> - {"gdsc", -1, -1},
>   {"vdda", 10, 100},  /* 1.25 V */
>   {"vddio", 10, 100}, /* 1.8 V */
>   {"vcca", 1, 100},   /* 1.0 V */
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c 
> b/drivers/gpu/drm/msm/dsi/dsi_host.c
> index ed504fe5074f..66c425d4159c 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
> @@ -203,35 +203,22 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>  {
>   const struct msm_dsi_cfg_handler *cfg_hnd = NULL;
>   struct device *dev = &msm_host->pdev->dev;
> - struct regulator *gdsc_reg;
>   struct clk *ahb_clk;
>   int ret;
>   u32 major = 0, minor = 0;
>  
> - gdsc_reg = regulator_get(dev, "gdsc");
> - if (IS_ERR(gdsc_reg)) {
> - pr_err("%s: cannot get gdsc\n", __func__);
> - goto exit;
> - }
> -
>   ahb_clk = msm_clk_get(msm_host->pdev, "iface");
>   if (IS_ERR(ahb_clk)) {
>   pr_err("%s: cannot get interface clock\n", __func__);
> - goto put_gdsc;
> + goto exit;
>   }
>  
>   pm_runtime_get_sync(dev);
>  
> - ret = regulator_enable(gdsc_reg);
> - if (ret) {
> - pr_err("%s: unable to enable gdsc\n", __func__);
> - goto put_gdsc;
> - }
> -
>   ret = clk_prepare_enable(ahb_clk);
>   if (ret) {
>   pr_err("%s: unable to enable ahb_clk\n", __func__);
> - goto disable_gdsc;
> + goto runtime_put;
>   }
>  
>   ret = dsi_get_version(msm_host->ctrl_base, &major, &minor);
> @@ -246,11 +233,8 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>  
>  disable_clks:
>   clk_disable_unprepare(ahb_clk);
> -disable_gdsc:
> - regulator_disable(gdsc_reg);
> +runtime_put:
>   pm_runtime_put_sync(dev);
> -put_gdsc:
> - regulator_put(gdsc_reg);
>  exit:
>   return cfg_hnd;
>  }
> -- 
> 2.30.2
> 


[PATCH 12/53] drm/i915/xehp: Handle new device context ID format

2021-07-01 Thread Matt Roper
From: Stuart Summers 

Xe_HP changes the format of the context ID from past platforms.

Cc: Robert M. Fosha 
Signed-off-by: Stuart Summers 
Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Matt Roper 
---
 .../drm/i915/gt/intel_execlists_submission.c  | 74 ---
 drivers/gpu/drm/i915/gt/intel_lrc.c   |  8 ++
 drivers/gpu/drm/i915/gt/intel_lrc_reg.h   |  2 +
 drivers/gpu/drm/i915/i915_perf.c  | 29 +---
 drivers/gpu/drm/i915/i915_reg.h   |  5 ++
 5 files changed, 97 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 15ba0d83151a..3a9d99a69ed4 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -153,6 +153,12 @@
 #define GEN12_CSB_CTX_VALID(csb_dw) \
(FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID)
 
+#define XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE  BIT(1) /* upper csb dword */
+#define XEHP_CSB_SW_CTX_ID_MASK    GENMASK(31, 10)
+#define XEHP_IDLE_CTX_ID   0x
+#define XEHP_CSB_CTX_VALID(csb_dw) \
+   (FIELD_GET(XEHP_CSB_SW_CTX_ID_MASK, csb_dw) != XEHP_IDLE_CTX_ID)
+
 /* Typical size of the average request (2 pipecontrols and a MI_BB) */
 #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
 
@@ -490,6 +496,16 @@ __execlists_schedule_in(struct i915_request *rq)
/* Use a fixed tag for OA and friends */
GEM_BUG_ON(ce->tag <= BITS_PER_LONG);
ce->lrc.ccid = ce->tag;
+   } else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) {
+   /* We don't need a strict matching tag, just different values */
+   unsigned int tag = ffs(READ_ONCE(engine->context_tag));
+
+   GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
+   clear_bit(tag - 1, &engine->context_tag);
+   ce->lrc.ccid = tag << (XEHP_SW_CTX_ID_SHIFT - 32);
+
+   BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
+
} else {
/* We don't need a strict matching tag, just different values */
unsigned int tag = __ffs(engine->context_tag);
@@ -600,8 +616,14 @@ static void __execlists_schedule_out(struct i915_request * 
const rq,
intel_engine_add_retire(engine, ce->timeline);
 
ccid = ce->lrc.ccid;
-   ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
-   ccid &= GEN12_MAX_CONTEXT_HW_ID;
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) {
+   ccid >>= XEHP_SW_CTX_ID_SHIFT - 32;
+   ccid &= XEHP_MAX_CONTEXT_HW_ID;
+   } else {
+   ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
+   ccid &= GEN12_MAX_CONTEXT_HW_ID;
+   }
+
if (ccid < BITS_PER_LONG) {
GEM_BUG_ON(ccid == 0);
GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
@@ -1660,13 +1682,24 @@ static void invalidate_csb_entries(const u64 *first, 
const u64 *last)
  * bits 44-46: reserved
  * bits 47-57: sw context id of the lrc the GT switched away from
  * bits 58-63: sw counter of the lrc the GT switched away from
+ *
+ * Xe_HP csb shuffles things around compared to TGL:
+ *
+ * bits 0-3:   context switch detail (same possible values as TGL)
+ * bits 4-9:   engine instance
+ * bits 10-25: sw context id of the lrc the GT switched to
+ * bits 26-31: sw counter of the lrc the GT switched to
+ * bit  32:semaphore wait mode (poll or signal), Only valid when
+ * switch detail is set to "wait on semaphore"
+ * bit  33:switched to new queue
+ * bits 34-41: wait detail (for switch detail 1 to 4)
+ * bits 42-57: sw context id of the lrc the GT switched away from
+ * bits 58-63: sw counter of the lrc the GT switched away from
  */
-static bool gen12_csb_parse(const u64 csb)
+static inline bool
+__gen12_csb_parse(bool ctx_to_valid, bool ctx_away_valid, bool new_queue,
+ u8 switch_detail)
 {
-   bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(csb));
-   bool new_queue =
-   lower_32_bits(csb) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
-
/*
 * The context switch detail is not guaranteed to be 5 when a preemption
 * occurs, so we can't just check for that. The check below works for
@@ -1675,7 +1708,7 @@ static bool gen12_csb_parse(const u64 csb)
 * would require some extra handling, but we don't support that.
 */
if (!ctx_away_valid || new_queue) {
-   GEM_BUG_ON(!GEN12_CSB_CTX_VALID(lower_32_bits(csb)));
+   GEM_BUG_ON(!ctx_to_valid);
return true;
}
 
@@ -1684,10 +1717,26 @@ static bool gen12_csb_parse(const u64 csb)
 * context switch on an unsuccessful wait instruction since we always
 * use polling mode.
 */
-   

[PATCH 40/53] drm/i915/dg2: Don't read DRAM info

2021-07-01 Thread Matt Roper
DG2 does not use system DRAM information for BW_BUDDY programming or
watermark workarounds, so there's no need to read this out at startup.

Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_dram.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dram.c 
b/drivers/gpu/drm/i915/intel_dram.c
index 879b0f007be3..9675bb94b70b 100644
--- a/drivers/gpu/drm/i915/intel_dram.c
+++ b/drivers/gpu/drm/i915/intel_dram.c
@@ -494,15 +494,15 @@ void intel_dram_detect(struct drm_i915_private *i915)
struct dram_info *dram_info = &i915->dram_info;
int ret;
 
+   if (GRAPHICS_VER(i915) < 9 || IS_DG2(i915) || !HAS_DISPLAY(i915))
+   return;
+
/*
 * Assume level 0 watermark latency adjustment is needed until proven
 * otherwise, this w/a is not needed by bxt/glk.
 */
dram_info->wm_lv_0_adjust_needed = !IS_GEN9_LP(i915);
 
-   if (GRAPHICS_VER(i915) < 9 || !HAS_DISPLAY(i915))
-   return;
-
if (GRAPHICS_VER(i915) >= 12)
ret = gen12_get_dram_info(i915);
else if (GRAPHICS_VER(i915) >= 11)
-- 
2.25.4



[PATCH 35/53] drm/i915/dg2: Skip shared DPLL handling

2021-07-01 Thread Matt Roper
DG2 has no shared DPLL's or DDI clock muxing.  The Port PLL is embedded
within the PHY.

Bspec: 54032
Bspec: 54034
Cc: Lucas De Marchi 
Cc: Mohammed Khajapasha 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display.c  | 10 +++---
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c |  5 -
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 026c28c612f0..c673d0c8fb4a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3474,7 +3474,8 @@ static void icl_ddi_bigjoiner_pre_enable(struct 
intel_atomic_state *state,
 * Enable sequence steps 1-7 on bigjoiner master
 */
intel_encoders_pre_pll_enable(state, master);
-   intel_enable_shared_dpll(master_crtc_state);
+   if (master_crtc_state->shared_dpll)
+   intel_enable_shared_dpll(master_crtc_state);
intel_encoders_pre_enable(state, master);
 
/* and DSC on slave */
@@ -8633,10 +8634,11 @@ intel_pipe_config_compare(const struct intel_crtc_state 
*current_config,
 
PIPE_CONF_CHECK_BOOL(double_wide);
 
-   PIPE_CONF_CHECK_P(shared_dpll);
+   if (dev_priv->dpll.mgr)
+   PIPE_CONF_CHECK_P(shared_dpll);
 
/* FIXME do the readout properly and get rid of this quirk */
-   if (!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) {
+   if (dev_priv->dpll.mgr && 
!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) {
PIPE_CONF_CHECK_X(dpll_hw_state.dpll);
PIPE_CONF_CHECK_X(dpll_hw_state.dpll_md);
PIPE_CONF_CHECK_X(dpll_hw_state.fp0);
@@ -8668,7 +8670,9 @@ intel_pipe_config_compare(const struct intel_crtc_state 
*current_config,
PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_ssc);
PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_bias);
PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_tdc_coldst_bias);
+   }
 
+   if (!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) {
PIPE_CONF_CHECK_X(dsi_pll.ctrl);
PIPE_CONF_CHECK_X(dsi_pll.div);
 
diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c 
b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
index 882bfd499e55..5688d9704636 100644
--- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
@@ -4462,7 +4462,10 @@ void intel_shared_dpll_init(struct drm_device *dev)
const struct dpll_info *dpll_info;
int i;
 
-   if (IS_ALDERLAKE_P(dev_priv))
+   if (IS_DG2(dev_priv))
+   /* No shared DPLLs on DG2; port PLLs are part of the PHY */
+   dpll_mgr = NULL;
+   else if (IS_ALDERLAKE_P(dev_priv))
dpll_mgr = _pll_mgr;
else if (IS_ALDERLAKE_S(dev_priv))
dpll_mgr = _pll_mgr;
-- 
2.25.4



[PATCH 36/53] drm/i915/dg2: Don't wait for AUX power well enable ACKs

2021-07-01 Thread Matt Roper
On DG2 we're supposed to just wait 600us after programming the well
before moving on; there won't be an ack from the hardware.

Bspec: 49296
Signed-off-by: Matt Roper 
---
 .../gpu/drm/i915/display/intel_display_power.c   | 16 
 .../gpu/drm/i915/display/intel_display_power.h   |  6 ++
 2 files changed, 22 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 285380079aab..c34ff0947b85 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -341,6 +341,17 @@ static void hsw_wait_for_power_well_enable(struct 
drm_i915_private *dev_priv,
 {
const struct i915_power_well_regs *regs = power_well->desc->hsw.regs;
int pw_idx = power_well->desc->hsw.idx;
+   int enable_delay = power_well->desc->hsw.fixed_enable_delay;
+
+   /*
+* For some power wells we're not supposed to watch the status bit for
+* an ack, but rather just wait a fixed amount of time and then
+* proceed.  This is only used on DG2.
+*/
+   if (IS_DG2(dev_priv) && enable_delay) {
+   usleep_range(enable_delay, 2 * enable_delay);
+   return;
+   }
 
/* Timeout for PW1:10 us, AUX:not specified, other PWs:20 us. */
if (intel_de_wait_for_set(dev_priv, regs->driver,
@@ -4828,6 +4839,7 @@ static const struct i915_power_well_desc 
xelpd_power_wells[] = {
{
.hsw.regs = _aux_power_well_regs,
.hsw.idx = ICL_PW_CTL_IDX_AUX_A,
+   .hsw.fixed_enable_delay = 600,
},
},
{
@@ -4838,6 +4850,7 @@ static const struct i915_power_well_desc 
xelpd_power_wells[] = {
{
.hsw.regs = _aux_power_well_regs,
.hsw.idx = ICL_PW_CTL_IDX_AUX_B,
+   .hsw.fixed_enable_delay = 600,
},
},
{
@@ -4848,6 +4861,7 @@ static const struct i915_power_well_desc 
xelpd_power_wells[] = {
{
.hsw.regs = _aux_power_well_regs,
.hsw.idx = ICL_PW_CTL_IDX_AUX_C,
+   .hsw.fixed_enable_delay = 600,
},
},
{
@@ -4858,6 +4872,7 @@ static const struct i915_power_well_desc 
xelpd_power_wells[] = {
{
.hsw.regs = _aux_power_well_regs,
.hsw.idx = XELPD_PW_CTL_IDX_AUX_D,
+   .hsw.fixed_enable_delay = 600,
},
},
{
@@ -4878,6 +4893,7 @@ static const struct i915_power_well_desc 
xelpd_power_wells[] = {
{
.hsw.regs = _aux_power_well_regs,
.hsw.idx = TGL_PW_CTL_IDX_AUX_TC1,
+   .hsw.fixed_enable_delay = 600,
},
},
{
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h 
b/drivers/gpu/drm/i915/display/intel_display_power.h
index 4f0917df4375..22367b5cba96 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -223,6 +223,12 @@ struct i915_power_well_desc {
u8 idx;
/* Mask of pipes whose IRQ logic is backed by the pw */
u8 irq_pipe_mask;
+   /*
+* Instead of waiting for the status bit to ack enables,
+* just wait a specific amount of time and then consider
+* the well enabled.
+*/
+   u16 fixed_enable_delay;
/* The pw is backing the VGA functionality */
bool has_vga:1;
bool has_fuses:1;
-- 
2.25.4



[PATCH 49/53] drm/i915/dg2: Add DG2 to the PSR2 defeature list

2021-07-01 Thread Matt Roper
From: José Roberto de Souza 

PSR2 is not supported on DG2.

Cc: Caz Yokoyama 
Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_psr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 4ba5337064ea..422e48927b5b 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -866,7 +866,8 @@ static bool intel_psr2_config_valid(struct intel_dp 
*intel_dp,
}
 
/* Wa_16011181250 */
-   if (IS_ROCKETLAKE(dev_priv) || IS_ALDERLAKE_S(dev_priv)) {
+   if (IS_ROCKETLAKE(dev_priv) || IS_ALDERLAKE_S(dev_priv) ||
+   IS_DG2(dev_priv)) {
drm_dbg_kms(&dev_priv->drm, "PSR2 is defeatured for this platform\n");
return false;
}
-- 
2.25.4



[PATCH 48/53] drm/i915/dg2: Update lane disable power state during PSR

2021-07-01 Thread Matt Roper
From: Gwan-gyeong Mun 

The PSR enable/disable sequences now require that we program an extra
register in the PHY to adjust the lane disable power setting.

Bspec: 49274
Bspec: 53885
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
Signed-off-by: Gwan-gyeong Mun 
---
 drivers/gpu/drm/i915/display/intel_psr.c  |  7 +++
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 14 ++
 drivers/gpu/drm/i915/display/intel_snps_phy.h |  3 +++
 drivers/gpu/drm/i915/i915_reg.h   |  3 +++
 4 files changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 9643624fe160..4ba5337064ea 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -32,6 +32,7 @@
 #include "intel_dp_aux.h"
 #include "intel_hdmi.h"
 #include "intel_psr.h"
+#include "intel_snps_phy.h"
 #include "intel_sprite.h"
 #include "skl_universal_plane.h"
 
@@ -1206,6 +1207,7 @@ static void intel_psr_enable_locked(struct intel_dp 
*intel_dp,
 {
struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
+   enum phy phy = intel_port_to_phy(dev_priv, dig_port->base.port);
struct intel_encoder *encoder = &dig_port->base;
u32 val;
 
@@ -1231,6 +1233,7 @@ static void intel_psr_enable_locked(struct intel_dp 
*intel_dp,
intel_dp_compute_psr_vsc_sdp(intel_dp, crtc_state, conn_state,
 &intel_dp->psr.vsc);
intel_write_dp_vsc_sdp(encoder, crtc_state, &intel_dp->psr.vsc);
+   intel_snps_phy_update_psr_power_state(dev_priv, phy, true);
intel_psr_enable_sink(intel_dp);
intel_psr_enable_source(intel_dp);
intel_dp->psr.enabled = true;
@@ -1327,6 +1330,8 @@ static void intel_psr_wait_exit_locked(struct intel_dp 
*intel_dp)
 static void intel_psr_disable_locked(struct intel_dp *intel_dp)
 {
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
+   enum phy phy = intel_port_to_phy(dev_priv,
+dp_to_dig_port(intel_dp)->base.port);
 
lockdep_assert_held(&intel_dp->psr.lock);
 
@@ -1353,6 +1358,8 @@ static void intel_psr_disable_locked(struct intel_dp 
*intel_dp)
 
TRANS_SET_CONTEXT_LATENCY(intel_dp->psr.transcoder),
 TRANS_SET_CONTEXT_LATENCY_MASK, 0);
 
+   intel_snps_phy_update_psr_power_state(dev_priv, phy, false);
+
/* Disable PSR on Sink */
drm_dp_dpcd_writeb(&intel_dp->aux, DP_PSR_EN_CFG, 0);
 
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c 
b/drivers/gpu/drm/i915/display/intel_snps_phy.c
index f0c30d3d2dfb..18b52b64af95 100644
--- a/drivers/gpu/drm/i915/display/intel_snps_phy.c
+++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c
@@ -36,6 +36,20 @@ void intel_snps_phy_wait_for_calibration(struct 
drm_i915_private *dev_priv)
}
 }
 
+void intel_snps_phy_update_psr_power_state(struct drm_i915_private *dev_priv,
+  enum phy phy, bool enable)
+{
+   u32 val;
+
+   if (!intel_phy_is_snps(dev_priv, phy))
+   return;
+
+   val = REG_FIELD_PREP(SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR,
+enable ? 2 : 3);
+   intel_uncore_rmw(&dev_priv->uncore, SNPS_PHY_TX_REQ(phy),
+SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR, val);
+}
+
 static const u32 dg2_ddi_translations[] = {
/* VS 0, pre-emph 0 */
REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26),
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h 
b/drivers/gpu/drm/i915/display/intel_snps_phy.h
index 6aa33ff729ec..6261ff88ef5c 100644
--- a/drivers/gpu/drm/i915/display/intel_snps_phy.h
+++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h
@@ -12,8 +12,11 @@ struct drm_i915_private;
 struct intel_encoder;
 struct intel_crtc_state;
 struct intel_mpllb_state;
+enum phy;
 
 void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv);
+void intel_snps_phy_update_psr_power_state(struct drm_i915_private *dev_priv,
+  enum phy phy, bool enable);
 
 int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state,
   struct intel_encoder *encoder);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index e3a165eb4fb6..9c5426aeddff 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2340,6 +2340,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define SNPS_PHY_REF_CONTROL(phy)  _MMIO_SNPS(phy, 0x168188)
 #define   SNPS_PHY_REF_CONTROL_REF_RANGE   REG_GENMASK(31, 27)
 
+#define SNPS_PHY_TX_REQ(phy)   _MMIO_SNPS(phy, 0x168200)
+#define   SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR REG_GENMASK(31, 30)
+
 #define SNPS_PHY_TX_EQ(ln, phy)_MMIO_SNPS_LN(ln, phy, 
0x168300)
 #define   

[PATCH 53/53] drm/i915/dg2: Configure PCON in DP pre-enable path

2021-07-01 Thread Matt Roper
From: Ankit Nautiyal 

Add the functions to configure HDMI2.1 pcon for DG2, before DP link
training.

Signed-off-by: Ankit Nautiyal 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index 5499a2975a0e..77f79f3269a1 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2580,6 +2580,7 @@ static void dg2_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
if (!is_mst)
intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
 
+   intel_dp_configure_protocol_converter(intel_dp, crtc_state);
intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true);
/*
 * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit
@@ -2587,6 +2588,8 @@ static void dg2_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
 * training
 */
intel_dp_sink_set_fec_ready(intel_dp, crtc_state);
+   intel_dp_check_frl_training(intel_dp);
+   intel_dp_pcon_dsc_configure(intel_dp, crtc_state);
 
/*
 * 5.h Follow DisplayPort specification training sequence (see notes for
-- 
2.25.4



[PATCH 47/53] drm/i915/dg2: Wait for SNPS PHY calibration during display init

2021-07-01 Thread Matt Roper
Initialization of the PHY is handled by the hardware/firmware, but the
driver should wait up to 25ms for the PHY to report that its calibration
has completed.

Bspec: 49189
Bspec: 50107
Cc: Matt Atwood 
Signed-off-by: Matt Roper 
---
 .../gpu/drm/i915/display/intel_display_power.c|  5 +
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 15 +++
 drivers/gpu/drm/i915/display/intel_snps_phy.h |  3 +++
 drivers/gpu/drm/i915/i915_reg.h   |  1 +
 4 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index df6358638fee..83bc2e691560 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -18,6 +18,7 @@
 #include "intel_pm.h"
 #include "intel_pps.h"
 #include "intel_sideband.h"
+#include "intel_snps_phy.h"
 #include "intel_tc.h"
 #include "intel_vga.h"
 
@@ -5899,6 +5900,10 @@ static void icl_display_core_init(struct 
drm_i915_private *dev_priv,
if (DISPLAY_VER(dev_priv) >= 12)
tgl_bw_buddy_init(dev_priv);
 
+   /* 8. Ensure PHYs have completed calibration and adaptation */
+   if (IS_DG2(dev_priv))
+   intel_snps_phy_wait_for_calibration(dev_priv);
+
if (resume && intel_dmc_has_payload(dev_priv))
intel_dmc_load_program(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c 
b/drivers/gpu/drm/i915/display/intel_snps_phy.c
index 77759bda98a4..f0c30d3d2dfb 100644
--- a/drivers/gpu/drm/i915/display/intel_snps_phy.c
+++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c
@@ -21,6 +21,21 @@
  * since it is not handled by the shared DPLL framework as on other platforms.
  */
 
+void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv)
+{
+   enum phy phy;
+
+   for_each_phy_masked(phy, ~0) {
+   if (!intel_phy_is_snps(dev_priv, phy))
+   continue;
+
+   if (intel_de_wait_for_clear(dev_priv, ICL_PHY_MISC(phy),
+   DG2_PHY_DP_TX_ACK_MASK, 25))
+   DRM_ERROR("SNPS PHY %c failed to calibrate after 25ms.\n",
+ phy_name(phy));
+   }
+}
+
 static const u32 dg2_ddi_translations[] = {
/* VS 0, pre-emph 0 */
REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26),
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h 
b/drivers/gpu/drm/i915/display/intel_snps_phy.h
index 3ce92d424f66..6aa33ff729ec 100644
--- a/drivers/gpu/drm/i915/display/intel_snps_phy.h
+++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h
@@ -8,10 +8,13 @@
 
 #include 
 
+struct drm_i915_private;
 struct intel_encoder;
 struct intel_crtc_state;
 struct intel_mpllb_state;
 
+void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv);
+
 int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state,
   struct intel_encoder *encoder);
 void intel_mpllb_enable(struct intel_encoder *encoder,
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 203056b9f02c..e3a165eb4fb6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -12442,6 +12442,7 @@ enum skl_power_gate {
 _ICL_PHY_MISC_B)
 #define  ICL_PHY_MISC_MUX_DDID (1 << 28)
 #define  ICL_PHY_MISC_DE_IO_COMP_PWR_DOWN  (1 << 23)
+#define  DG2_PHY_DP_TX_ACK_MASKREG_GENMASK(23, 20)
 
 /* Icelake Display Stream Compression Registers */
 #define DSCA_PICTURE_PARAMETER_SET_0   _MMIO(0x6B200)
-- 
2.25.4
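
For readers following along outside the kernel tree, the bounded wait
added above can be sketched in plain C.  This is a minimal userspace
sketch under stated assumptions, not the i915 implementation:
wait_for_clear(), reg_read_fn, struct fake_phy and TX_ACK_MASK are
hypothetical names, and the wait is bounded by a poll count where the
patch bounds it by 25 ms of elapsed time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ack field in bits 23:20. */
#define TX_ACK_MASK (0xFu << 20)

/* Hypothetical register reader: returns the current register value. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until every bit in 'mask' reads back as zero, or until
 * 'max_polls' reads have been made.  Returns true if the bits cleared.
 * A real driver would bound the wait by time, not by a poll count.
 */
bool wait_for_clear(reg_read_fn read, void *ctx,
		    uint32_t mask, unsigned int max_polls)
{
	assert(read != NULL);

	for (unsigned int i = 0; i < max_polls; i++) {
		if ((read(ctx) & mask) == 0)
			return true;
	}
	return false;
}

/* Fake PHY: the ack bits stay set for 'busy_reads' more reads. */
struct fake_phy {
	unsigned int busy_reads;
};

uint32_t fake_phy_read(void *ctx)
{
	struct fake_phy *phy = ctx;

	if (phy->busy_reads > 0) {
		phy->busy_reads--;
		return TX_ACK_MASK;
	}
	return 0;
}
```

A caller would loop over the PHYs, skip the ones that are not SNPS
PHYs, and log an error for each PHY where wait_for_clear() returns
false, which mirrors the shape of the function in the patch.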



[PATCH 04/53] drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based

2021-07-01 Thread Matt Roper
From: Tvrtko Ursulin 

On Xe_HP the fusing register is renamed and changed to have the "enable"
semantics, but otherwise remains compatible (mmio address, bitmask
ranges) with older platforms.

To simplify things we do not add a new register definition but just stop
inverting the fusing masks before processing them.

Bspec: 33288
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Tvrtko Ursulin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 88694822716a..151870d8fdd3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -468,7 +468,14 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
if (GRAPHICS_VER(i915) < 11)
return info->engine_mask;
 
-   media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
+   /*
+* On newer platforms the fusing register is called 'enable' and has
+* enable semantics, while on older platforms it is called 'disable'
+* and bits have disable semantics.
+*/
+   media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
+   if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
+   media_fuse = ~media_fuse;
 
vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
-- 
2.25.4
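
The normalisation described above, where only the older "disable"
register value is inverted, can be sketched in standalone C.  This is
an illustration only: the function names, the enable_semantics flag
(standing in for the GRAPHICS_VER_FULL check) and the field layout
macros are assumptions made for the sketch, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative field layout, assumed for this sketch only. */
#define VDBOX_MASK  0x000000ffu
#define VEBOX_MASK  0x000f0000u
#define VEBOX_SHIFT 16

/*
 * Normalise a raw fuse register value so that a set bit always means
 * "engine present".  With enable semantics the value is used as read;
 * with disable semantics it is inverted first.
 */
uint32_t media_fuse_to_enable_mask(uint32_t raw, bool enable_semantics)
{
	return enable_semantics ? raw : ~raw;
}

/* Extract the per-class engine bits from the normalised mask. */
uint32_t vdbox_enabled(uint32_t enable_mask)
{
	return enable_mask & VDBOX_MASK;
}

uint32_t vebox_enabled(uint32_t enable_mask)
{
	return (enable_mask & VEBOX_MASK) >> VEBOX_SHIFT;
}
```

Because the inversion happens once, up front, the mask extraction that
follows is identical for both register flavours.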



[PATCH 46/53] drm/i915/dg2: Classify DG2 PHY types

2021-07-01 Thread Matt Roper
Although the bspec labels four of DG2's outputs as "combo PHY," the
underlying PHYs in both cases are actually Synopsys PHYs that are
programmed completely differently than the traditional Intel "combo" PHY
units.  As such, we don't want intel_phy_is_combo to take us down legacy
programming paths, so just return false from it on DG2.  Instead add a
new intel_phy_is_snps() that will return true for all DG2 PHYs.

Cc: Anusha Srivatsa 
Cc: Matt Atwood 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display.c | 26 +++-
 drivers/gpu/drm/i915/display/intel_display.h |  1 +
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index cce520b6dfcf..9655f1b1b41b 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3698,6 +3698,13 @@ bool intel_phy_is_combo(struct drm_i915_private 
*dev_priv, enum phy phy)
 {
if (phy == PHY_NONE)
return false;
+   else if (IS_DG2(dev_priv))
+   /*
+* DG2 outputs labelled as "combo PHY" in the bspec use
+* SNPS PHYs with completely different programming,
+* hence we always return false here.
+*/
+   return false;
else if (IS_ALDERLAKE_S(dev_priv))
return phy <= PHY_E;
else if (IS_DG1(dev_priv) || IS_ROCKETLAKE(dev_priv))
@@ -3712,7 +3719,10 @@ bool intel_phy_is_combo(struct drm_i915_private 
*dev_priv, enum phy phy)
 
 bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy)
 {
-   if (IS_ALDERLAKE_P(dev_priv))
+   if (IS_DG2(dev_priv))
+   /* DG2's "TC1" output uses a SNPS PHY */
+   return false;
+   else if (IS_ALDERLAKE_P(dev_priv))
return phy >= PHY_F && phy <= PHY_I;
else if (IS_TIGERLAKE(dev_priv))
return phy >= PHY_D && phy <= PHY_I;
@@ -3722,6 +3732,20 @@ bool intel_phy_is_tc(struct drm_i915_private *dev_priv, 
enum phy phy)
return false;
 }
 
+bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy)
+{
+   if (phy == PHY_NONE)
+   return false;
+   else if (IS_DG2(dev_priv))
+   /*
+* All four "combo" ports and the TC1 port (PHY E) use
+* Synopsys PHYs.
+*/
+   return phy <= PHY_E;
+
+   return false;
+}
+
 enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port)
 {
if (DISPLAY_VER(i915) >= 13 && port >= PORT_D_XELPD)
diff --git a/drivers/gpu/drm/i915/display/intel_display.h 
b/drivers/gpu/drm/i915/display/intel_display.h
index c9dbaf074d77..284936f0ddab 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -561,6 +561,7 @@ struct drm_display_mode *
 intel_encoder_current_mode(struct intel_encoder *encoder);
 bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy);
 bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy);
+bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy);
 enum tc_port intel_port_to_tc(struct drm_i915_private *dev_priv,
  enum port port);
 int intel_get_pipe_from_crtc_id_ioctl(struct drm_device *dev, void *data,
-- 
2.25.4
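
The classification above can be modelled in a few lines of standalone
C.  This is a simplified sketch: enum platform and the function names
are hypothetical, only the DG2 and ADL-S branches from the patch are
modelled, and every other platform is treated as having neither PHY
type, which is not true of the real driver.

```c
#include <assert.h>
#include <stdbool.h>

enum phy { PHY_NONE = -1, PHY_A = 0, PHY_B, PHY_C, PHY_D, PHY_E, PHY_F };

/* Hypothetical platform tag replacing the IS_DG2()-style macros. */
enum platform { PLATFORM_DG2, PLATFORM_ADL_S, PLATFORM_OTHER };

/* On DG2, PHY A through PHY E are Synopsys PHYs. */
bool phy_is_snps(enum platform p, enum phy phy)
{
	/* PHY_NONE is negative, so it must be rejected before the range test. */
	if (phy == PHY_NONE)
		return false;
	if (p == PLATFORM_DG2)
		return phy <= PHY_E;
	return false;
}

/* DG2 never reports a combo PHY, so legacy paths are not taken. */
bool phy_is_combo(enum platform p, enum phy phy)
{
	if (phy == PHY_NONE)
		return false;
	if (p == PLATFORM_DG2)
		return false;
	if (p == PLATFORM_ADL_S)
		return phy <= PHY_E;
	return false; /* simplification: other platforms not modelled */
}
```

Keeping the two predicates mutually exclusive on DG2 means callers
that branch on phy_is_combo() first cannot fall into combo PHY
programming by accident.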



[PATCH 43/53] drm/i915/dg2: Add MPLLB programming for HDMI

2021-07-01 Thread Matt Roper
At the moment we don't have a proper algorithm that can be used to
calculate PHY settings for arbitrary HDMI link rates.  The PHY tables
here should support the regular modes of real-world HDMI monitors.

Bspec: 54032
Cc: Matt Atwood 
Signed-off-by: Matt Roper 
Signed-off-by: Vandita Kulkarni 
---
 drivers/gpu/drm/i915/display/intel_ddi.c  |  14 +-
 drivers/gpu/drm/i915/display/intel_display.c  |  47 +++
 drivers/gpu/drm/i915/display/intel_hdmi.c |  11 +
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 286 +-
 drivers/gpu/drm/i915/display/intel_snps_phy.h |   7 +
 drivers/gpu/drm/i915/i915_reg.h   |   3 +
 6 files changed, 355 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index 26a3aa73fcc4..929a95ddb316 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -51,6 +51,7 @@
 #include "intel_panel.h"
 #include "intel_pps.h"
 #include "intel_psr.h"
+#include "intel_snps_phy.h"
 #include "intel_sprite.h"
 #include "intel_tc.h"
 #include "intel_vdsc.h"
@@ -3745,6 +3746,15 @@ void intel_ddi_get_clock(struct intel_encoder *encoder,
 
&crtc_state->dpll_hw_state);
 }
 
+static void dg2_ddi_get_config(struct intel_encoder *encoder,
+   struct intel_crtc_state *crtc_state)
+{
+   intel_mpllb_readout_hw_state(encoder, &crtc_state->mpllb_state);
+   crtc_state->port_clock = intel_mpllb_calc_port_clock(encoder, &crtc_state->mpllb_state);
+
+   intel_ddi_get_config(encoder, crtc_state);
+}
+
 static void adls_ddi_get_config(struct intel_encoder *encoder,
struct intel_crtc_state *crtc_state)
 {
@@ -4606,7 +4616,9 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, 
enum port port)
encoder->cloneable = 0;
encoder->pipe_mask = ~0;
 
-   if (IS_ALDERLAKE_S(dev_priv)) {
+   if (IS_DG2(dev_priv)) {
+   encoder->get_config = dg2_ddi_get_config;
+   } else if (IS_ALDERLAKE_S(dev_priv)) {
encoder->enable_clock = adls_ddi_enable_clock;
encoder->disable_clock = adls_ddi_disable_clock;
encoder->is_clock_enabled = adls_ddi_is_clock_enabled;
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 91f6964ec406..cce520b6dfcf 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9113,6 +9113,52 @@ verify_shared_dpll_state(struct intel_crtc *crtc,
}
 }
 
+static void
+verify_mpllb_state(struct intel_atomic_state *state,
+  struct intel_crtc_state *new_crtc_state)
+{
+   struct drm_i915_private *i915 = to_i915(state->base.dev);
+   struct intel_mpllb_state mpllb_hw_state = { 0 };
+   struct intel_mpllb_state *mpllb_sw_state = &new_crtc_state->mpllb_state;
+   struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->uapi.crtc);
+   struct intel_encoder *encoder;
+
+   if (!IS_DG2(i915))
+   return;
+
+   if (!new_crtc_state->hw.active)
+   return;
+
+   encoder = intel_get_crtc_new_encoder(state, new_crtc_state);
+   intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state);
+
+#define MPLLB_CHECK(name) do { \
+   if (mpllb_sw_state->name != mpllb_hw_state.name) { \
+   pipe_config_mismatch(false, crtc, "MPLLB:" __stringify(name), \
+"(expected 0x%08x, found 0x%08x)", \
+mpllb_sw_state->name, \
+mpllb_hw_state.name); \
+   } \
+} while (0)
+
+   MPLLB_CHECK(mpllb_cp);
+   MPLLB_CHECK(mpllb_div);
+   MPLLB_CHECK(mpllb_div2);
+   MPLLB_CHECK(mpllb_fracn1);
+   MPLLB_CHECK(mpllb_fracn2);
+   MPLLB_CHECK(mpllb_sscen);
+   MPLLB_CHECK(mpllb_sscstep);
+
+   /*
+* ref_control is handled by the hardware/firmware and never
+* programmed by the software, but the proper values are supplied
+* in the bspec for verification purposes.
+*/
+   MPLLB_CHECK(ref_control);
+
+#undef MPLLB_CHECK
+}
+
 static void
 intel_modeset_verify_crtc(struct intel_crtc *crtc,
  struct intel_atomic_state *state,
@@ -9126,6 +9172,7 @@ intel_modeset_verify_crtc(struct intel_crtc *crtc,
verify_connector_state(state, crtc);
verify_crtc_state(crtc, old_crtc_state, new_crtc_state);
verify_shared_dpll_state(crtc, old_crtc_state, new_crtc_state);
+   verify_mpllb_state(state, new_crtc_state);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c 
b/drivers/gpu/drm/i915/display/intel_hdmi.c
index 852af2b23540..b04685bb6439 100644
--- a/drivers/gpu/drm/i915/display/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/display/intel_hdmi.c
@@ -51,6 +51,7 @@
 #include "intel_hdmi.h"
 

[PATCH 29/53] drm/i915/dg2: Add new LRI reg offsets

2021-07-01 Thread Matt Roper
From: Akeem G Abodunrin 

New LRI register offsets were introduced for DG2.  This patch adds
those extra registers and creates a new register table for setting
offsets to compare with the HW-generated context image, especially for
the gt_lrc test.  It also updates the general purpose register with the
scratch offset for DG2, in order to use it for the live_lrc_fixed
selftest.

Cc: Chris P Wilson 
Cc: Prathap Kumar Valsan 
Signed-off-by: Akeem G Abodunrin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 85 -
 1 file changed, 83 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fee735e2a524..da7ac1d970af 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = {
END
 };
 
+static const u8 dg2_xcs_offsets[] = {
+   NOP(1),
+   LRI(15, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+   REG(0x120),
+   REG(0x124),
+
+   NOP(1),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   END
+};
+
 static const u8 gen8_rcs_offsets[] = {
NOP(1),
LRI(14, POSTED),
@@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = {
END
 };
 
+static const u8 dg2_rcs_offsets[] = {
+   NOP(1),
+   LRI(15, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+   REG(0x120),
+   REG(0x124),
+
+   NOP(1),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   LRI(3, POSTED),
+   REG(0x1b0),
+   REG16(0x5a8),
+   REG16(0x5ac),
+
+   NOP(6),
+   LRI(1, 0),
+   REG(0x0c8),
+
+   END
+};
+
 #undef END
 #undef REG16
 #undef REG
@@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
   !intel_engine_has_relative_mmio(engine));
 
if (engine->class == RENDER_CLASS) {
-   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+   return dg2_rcs_offsets;
+   else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
return xehp_rcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_rcs_offsets;
@@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
else
return gen8_rcs_offsets;
} else {
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
+   return dg2_xcs_offsets;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_xcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 9)
return gen9_xcs_offsets;
-- 
2.25.4
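
The ordering of the version checks in reg_offsets() matters: the most
specific version has to be tested first, otherwise DG2 would match the
generic gen12 branch.  A minimal standalone sketch of that ladder
follows.  The IP_VER() encoding shown here and the xcs_offsets_name()
helper are assumptions for illustration, and the final fallback string
is a placeholder because the hunk above does not show that branch.

```c
#include <assert.h>
#include <string.h>

/* Assumed encoding: major version in the high bits, release in the low 8. */
#define IP_VER(ver, rel) (((ver) << 8) | (rel))

/*
 * Return the name of the offset table a non-render engine would use.
 * Checks run from newest to oldest so that the first match wins.
 */
const char *xcs_offsets_name(unsigned int ver, unsigned int rel)
{
	unsigned int full = IP_VER(ver, rel);

	if (full >= IP_VER(12, 55))
		return "dg2_xcs_offsets";
	if (full >= IP_VER(12, 0))
		return "gen12_xcs_offsets";
	if (full >= IP_VER(9, 0))
		return "gen9_xcs_offsets";
	return "older"; /* placeholder: branch not shown in the hunk */
}
```

With this encoding a 12.50 part still selects the gen12 table for its
non-render engines, while 12.55 and anything newer selects the DG2
table.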



[PATCH 44/53] drm/i915/dg2: Add vswing programming for SNPS phys

2021-07-01 Thread Matt Roper
Vswing programming for SNPS PHYs is just a single step -- look up the
value that corresponds to the voltage level from a table and program it
into the SNPS_PHY_TX_EQ register.

Bspec: 53920
Cc: Matt Atwood 
Signed-off-by: Matt Roper 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/display/intel_ddi.c  | 23 ++--
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 54 +++
 drivers/gpu/drm/i915/display/intel_snps_phy.h |  4 ++
 drivers/gpu/drm/i915/i915_reg.h   |  5 ++
 4 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index 929a95ddb316..ade03cf41caa 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -1496,6 +1496,16 @@ static int intel_ddi_dp_level(struct intel_dp *intel_dp)
return translate_signal_level(intel_dp, signal_levels);
 }
 
+static void
+dg2_set_signal_levels(struct intel_dp *intel_dp,
+ const struct intel_crtc_state *crtc_state)
+{
+   struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
+   int level = intel_ddi_dp_level(intel_dp);
+
+   intel_snps_phy_ddi_vswing_sequence(encoder, level);
+}
+
 static void
 tgl_set_signal_levels(struct intel_dp *intel_dp,
  const struct intel_crtc_state *crtc_state)
@@ -2563,7 +2573,10 @@ static void tgl_ddi_pre_enable_dp(struct 
intel_atomic_state *state,
 */
 
/* 7.e Configure voltage swing and related IO settings */
-   tgl_ddi_vswing_sequence(encoder, crtc_state, level);
+   if (IS_DG2(dev_priv))
+   intel_snps_phy_ddi_vswing_sequence(encoder, level);
+   else
+   tgl_ddi_vswing_sequence(encoder, crtc_state, level);
 
/*
 * 7.f Combo PHY: Configure PORT_CL_DW10 Static Power Down to power up
@@ -3102,7 +3115,9 @@ static void intel_enable_ddi_hdmi(struct 
intel_atomic_state *state,
"[CONNECTOR:%d:%s] Failed to configure sink 
scrambling/TMDS bit clock ratio\n",
connector->base.id, connector->name);
 
-   if (DISPLAY_VER(dev_priv) >= 12)
+   if (IS_DG2(dev_priv))
+   intel_snps_phy_ddi_vswing_sequence(encoder, U32_MAX);
+   else if (DISPLAY_VER(dev_priv) >= 12)
tgl_ddi_vswing_sequence(encoder, crtc_state, level);
else if (DISPLAY_VER(dev_priv) == 11)
icl_ddi_vswing_sequence(encoder, crtc_state, level);
@@ -4075,7 +4090,9 @@ intel_ddi_init_dp_connector(struct intel_digital_port 
*dig_port)
dig_port->dp.set_link_train = intel_ddi_set_link_train;
dig_port->dp.set_idle_link_train = intel_ddi_set_idle_link_train;
 
-   if (DISPLAY_VER(dev_priv) >= 12)
+   if (IS_DG2(dev_priv))
+   dig_port->dp.set_signal_levels = dg2_set_signal_levels;
+   else if (DISPLAY_VER(dev_priv) >= 12)
dig_port->dp.set_signal_levels = tgl_set_signal_levels;
else if (DISPLAY_VER(dev_priv) >= 11)
dig_port->dp.set_signal_levels = icl_set_signal_levels;
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c 
b/drivers/gpu/drm/i915/display/intel_snps_phy.c
index 1317b4e94b50..77759bda98a4 100644
--- a/drivers/gpu/drm/i915/display/intel_snps_phy.c
+++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c
@@ -21,6 +21,60 @@
  * since it is not handled by the shared DPLL framework as on other platforms.
  */
 
+static const u32 dg2_ddi_translations[] = {
+   /* VS 0, pre-emph 0 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26),
+
+   /* VS 0, pre-emph 1 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 33) |
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 6),
+
+   /* VS 0, pre-emph 2 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 38) |
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 12),
+
+   /* VS 0, pre-emph 3 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 43) |
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 19),
+
+   /* VS 1, pre-emph 0 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 39),
+
+   /* VS 1, pre-emph 1 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 44) |
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 8),
+
+   /* VS 1, pre-emph 2 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 47) |
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 15),
+
+   /* VS 2, pre-emph 0 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 52),
+
+   /* VS 2, pre-emph 1 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 51) |
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 10),
+
+   /* VS 3, pre-emph 0 */
+   REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 62),
+};
+
+void intel_snps_phy_ddi_vswing_sequence(struct intel_encoder *encoder,
+   u32 level)
+{
+   struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
+   enum phy phy = intel_port_to_phy(dev_priv, encoder->port);
+   int 

[PATCH 45/53] drm/i915/dg2: Update modeset sequences

2021-07-01 Thread Matt Roper
DG2 has some changes to the expected modesetting sequences when compared
to gen12.  Adjust our driver logic accordingly.  Although the DP
sequence is pretty similar to TGL's, there are some steps that change,
so let's split the handling for that out into a separate function.

Bspec: 54128
Cc: Lucas De Marchi 
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 135 +--
 1 file changed, 127 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index ade03cf41caa..5499a2975a0e 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -172,14 +172,22 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private 
*dev_priv,
 static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv,
  enum port port)
 {
+   int ret;
+
/* Wait > 518 usecs for DDI_BUF_CTL to be non idle */
if (DISPLAY_VER(dev_priv) < 10) {
usleep_range(518, 1000);
return;
}
 
-   if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) &
- DDI_BUF_IS_IDLE), 500))
+   if (IS_DG2(dev_priv))
+   ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) &
+   DDI_BUF_IS_IDLE), 1200);
+   else
+   ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) &
+   DDI_BUF_IS_IDLE), 500);
+
+   if (ret)
drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get active\n",
port_name(port));
 }
@@ -2207,7 +2215,7 @@ void intel_ddi_sanitize_encoder_pll_mapping(struct 
intel_encoder *encoder)
ddi_clk_needed = false;
}
 
-   if (ddi_clk_needed || !encoder->disable_clock ||
+   if (ddi_clk_needed || !encoder->is_clock_enabled ||
!encoder->is_clock_enabled(encoder))
return;
 
@@ -2488,6 +2496,116 @@ static void intel_ddi_mso_configure(const struct 
intel_crtc_state *crtc_state)
 OVERLAP_PIXELS_MASK, dss1);
 }
 
+static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state,
+ struct intel_encoder *encoder,
+ const struct intel_crtc_state *crtc_state,
+ const struct drm_connector_state *conn_state)
+{
+   struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
+   struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
+   enum phy phy = intel_port_to_phy(dev_priv, encoder->port);
+   struct intel_digital_port *dig_port = enc_to_dig_port(encoder);
+   bool is_mst = intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DP_MST);
+   int level = intel_ddi_dp_level(intel_dp);
+
+   intel_dp_set_link_params(intel_dp, crtc_state->port_clock,
+crtc_state->lane_count);
+
+   /*
+* 1. Enable Power Wells
+*
+* This was handled at the beginning of intel_atomic_commit_tail(),
+* before we called down into this function.
+*/
+
+   /* 2. Enable Panel Power if PPS is required */
+   intel_pps_on(intel_dp);
+
+   /*
+* 3. Enable the port PLL.
+*/
+   intel_ddi_enable_clock(encoder, crtc_state);
+
+   /* 4. Enable IO power */
+   if (!intel_phy_is_tc(dev_priv, phy) ||
+   dig_port->tc_mode != TC_PORT_TBT_ALT)
+   dig_port->ddi_io_wakeref = intel_display_power_get(dev_priv,
+  
dig_port->ddi_io_power_domain);
+
+   /*
+* 5. The rest of the below are substeps under the bspec's "Enable and
+* Train Display Port" step.  Note that steps that are specific to
+* MST will be handled by intel_mst_pre_enable_dp() before/after it
+* calls into this function.  Also intel_mst_pre_enable_dp() only calls
+* us when active_mst_links==0, so any steps designated for "single
+* stream or multi-stream master transcoder" can just be performed
+* unconditionally here.
+*/
+
+   /*
+* 5.a Configure Transcoder Clock Select to direct the Port clock to the
+* Transcoder.
+*/
+   intel_ddi_enable_pipe_clock(encoder, crtc_state);
+
+   /* 5.b Not relevant to i915 for now */
+
+   /*
+* 5.c Configure TRANS_DDI_FUNC_CTL DDI Select, DDI Mode Select & MST
+* Transport Select
+*/
+   intel_ddi_config_transcoder_func(encoder, crtc_state);
+
+   /*
+* 5.d Configure & enable DP_TP_CTL with link training pattern 1
+* selected
+*
+* This will be handled by the intel_dp_start_link_train() farther
+* down this function.
+*/
+
+   /* 5.e Configure voltage swing and 

[PATCH 07/53] drm/i915/xehp: Extra media engines - Part 1 (engine definitions)

2021-07-01 Thread Matt Roper
From: John Harrison 

Xe_HP can have a lot of extra media engines. This patch adds the basic
definitions for them.

Cc: Tvrtko Ursulin 
Signed-off-by: John Harrison 
Signed-off-by: Tomas Winkler 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c |  7 ++-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 50 
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 --
 drivers/gpu/drm/i915/i915_reg.h  |  6 +++
 4 files changed, 69 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 87b06572fd2e..35edc55720f4 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE)
aux_inv = rq->engine->mask & ~BIT(BCS0);
if (aux_inv)
-   cmd += 2 * hweight8(aux_inv) + 2;
+   cmd += 2 * hweight32(aux_inv) + 2;
 
cs = intel_ring_begin(rq, cmd);
if (IS_ERR(cs))
@@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
struct intel_engine_cs *engine;
unsigned int tmp;
 
-   *cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));
-   for_each_engine_masked(engine, rq->engine->gt,
-  aux_inv, tmp) {
+   *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
+   for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) {
*cs++ = i915_mmio_reg_offset(aux_inv_reg(engine));
*cs++ = AUX_INV;
}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 4ab2c9abb943..6e2aa1acc4d4 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -104,6 +104,38 @@ static const struct engine_info intel_engines[] = {
{ .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE }
},
},
+   [VCS4] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 4,
+   .mmio_bases = {
+   { .graphics_ver = 11, .base = XEHP_BSD5_RING_BASE }
+   },
+   },
+   [VCS5] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 5,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE }
+   },
+   },
+   [VCS6] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 6,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE }
+   },
+   },
+   [VCS7] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_DECODE_CLASS,
+   .instance = 7,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE }
+   },
+   },
[VECS0] = {
.hw_id = VECS0_HW,
.class = VIDEO_ENHANCEMENT_CLASS,
@@ -121,6 +153,22 @@ static const struct engine_info intel_engines[] = {
{ .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE }
},
},
+   [VECS2] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_ENHANCEMENT_CLASS,
+   .instance = 2,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE }
+   },
+   },
+   [VECS3] = {
+   .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
+   .class = VIDEO_ENHANCEMENT_CLASS,
+   .instance = 3,
+   .mmio_bases = {
+   { .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE }
+   },
+   },
 };
 
 /**
@@ -269,6 +317,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
 
BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH));
BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
+   BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1));
+   BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1));
 
if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine)))
return -EINVAL;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 5b91068ab277..b25f594a7e4b 100644
--- 

[PATCH 32/53] drm/i915/dg2: Define MOCS table for DG2

2021-07-01 Thread Matt Roper
Bspec: 45101, 45427
Cc: Ramalingam C (v5)
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_mocs.c | 35 +++-
 1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c 
b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 0c9d0b936c20..d22ca8212092 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -341,6 +341,30 @@ static const struct drm_i915_mocs_entry 
xehpsdv_mocs_table[] = {
MOCS_ENTRY(63, 0, L3_1_UC),
 };
 
+static const struct drm_i915_mocs_entry dg2_mocs_table[] = {
+   /* UC - Coherent; GO:L3 */
+   MOCS_ENTRY(0, 0, L3_1_UC | L3_LKUP(1)),
+   /* UC - Coherent; GO:Memory */
+   MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+   /* UC - Non-Coherent; GO:Memory */
+   MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)),
+
+   /* WB - LC */
+   MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
+};
+
+static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = {
+   /* Wa_14011441408: Set Go to Memory for MOCS#0 */
+   MOCS_ENTRY(0, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+   /* UC - Coherent; GO:Memory */
+   MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+   /* UC - Non-Coherent; GO:Memory */
+   MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)),
+
+   /* WB - LC */
+   MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
+};
+
 enum {
HAS_GLOBAL_MOCS = BIT(0),
HAS_ENGINE_MOCS = BIT(1),
@@ -367,7 +391,16 @@ static unsigned int get_mocs_settings(const struct 
drm_i915_private *i915,
 {
unsigned int flags;
 
-   if (IS_XEHPSDV(i915)) {
+   if (IS_DG2(i915)) {
+   if (IS_DG2_GT_STEP(i915, G10, STEP_A0, (STEP_B0 - 1))) {
+   table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax);
+   table->table = dg2_mocs_table_g10_ax;
+   } else {
+   table->size = ARRAY_SIZE(dg2_mocs_table);
+   table->table = dg2_mocs_table;
+   }
+   table->n_entries = GEN9_NUM_MOCS_ENTRIES;
+   } else if (IS_XEHPSDV(i915)) {
table->size = ARRAY_SIZE(xehpsdv_mocs_table);
table->table = xehpsdv_mocs_table;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
-- 
2.25.4



[PATCH 22/53] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

2021-07-01 Thread Matt Roper
From: Lucas De Marchi 

Instead of maintaining the same if ladder in 3 different places, add a
function to read RP_STATE_CAP.

Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c |  8 +++-
 drivers/gpu/drm/i915/gt/intel_rps.c | 17 -
 drivers/gpu/drm/i915/gt/intel_rps.h |  1 +
 drivers/gpu/drm/i915/i915_debugfs.c |  8 +++-
 4 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c 
b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
index 4270b5a34a83..1061a62bdfce 100644
--- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
+++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c
@@ -309,13 +309,11 @@ static int frequency_show(struct seq_file *m, void 
*unused)
int max_freq;
 
rp_state_limits = intel_uncore_read(uncore, 
GEN6_RP_STATE_LIMITS);
-   if (IS_GEN9_LP(i915)) {
-   rp_state_cap = intel_uncore_read(uncore, 
BXT_RP_STATE_CAP);
+   rp_state_cap = intel_rps_read_state_cap(rps);
+   if (IS_GEN9_LP(i915))
gt_perf_status = intel_uncore_read(uncore, 
BXT_GT_PERF_STATUS);
-   } else {
-   rp_state_cap = intel_uncore_read(uncore, 
GEN6_RP_STATE_CAP);
+   else
gt_perf_status = intel_uncore_read(uncore, 
GEN6_GT_PERF_STATUS);
-   }
 
/* RPSTAT1 is in the GT power well */
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index 06e9a8ed4e03..490bc1513480 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -975,20 +975,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val)
 static void gen6_rps_init(struct intel_rps *rps)
 {
struct drm_i915_private *i915 = rps_to_i915(rps);
-   struct intel_uncore *uncore = rps_to_uncore(rps);
+   u32 rp_state_cap = intel_rps_read_state_cap(rps);
 
/* All of these values are in units of 50MHz */
 
/* static values from HW: RP0 > RP1 > RPn (min_freq) */
if (IS_GEN9_LP(i915)) {
-   u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
-
rps->rp0_freq = (rp_state_cap >> 16) & 0xff;
rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
rps->min_freq = (rp_state_cap >>  0) & 0xff;
} else {
-   u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
-
rps->rp0_freq = (rp_state_cap >>  0) & 0xff;
rps->rp1_freq = (rp_state_cap >>  8) & 0xff;
rps->min_freq = (rp_state_cap >> 16) & 0xff;
@@ -1936,6 +1932,17 @@ u32 intel_rps_read_actual_frequency(struct intel_rps 
*rps)
return freq;
 }
 
+u32 intel_rps_read_state_cap(struct intel_rps *rps)
+{
+   struct drm_i915_private *i915 = rps_to_i915(rps);
+   struct intel_uncore *uncore = rps_to_uncore(rps);
+
+   if (IS_GEN9_LP(i915))
+   return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
+   else
+   return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
+}
+
 /* External interface for intel_ips.ko */
 
 static struct drm_i915_private __rcu *ips_mchdev;
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h 
b/drivers/gpu/drm/i915/gt/intel_rps.h
index 1d2cfc98b510..6e06dd61f818 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -31,6 +31,7 @@ int intel_gpu_freq(struct intel_rps *rps, int val);
 int intel_freq_opcode(struct intel_rps *rps, int val);
 u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
 u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
+u32 intel_rps_read_state_cap(struct intel_rps *rps);
 
 void gen5_rps_irq_handler(struct intel_rps *rps);
 void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c 
b/drivers/gpu/drm/i915/i915_debugfs.c
index cc745751ac53..6c83da3956b9 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -420,13 +420,11 @@ static int i915_frequency_info(struct seq_file *m, void 
*unused)
int max_freq;
 
rp_state_limits = intel_uncore_read(&dev_priv->uncore,
GEN6_RP_STATE_LIMITS);
-   if (IS_GEN9_LP(dev_priv)) {
-   rp_state_cap = intel_uncore_read(&dev_priv->uncore,
BXT_RP_STATE_CAP);
+   rp_state_cap = intel_rps_read_state_cap(rps);
+   if (IS_GEN9_LP(dev_priv))
gt_perf_status = intel_uncore_read(&dev_priv->uncore,
BXT_GT_PERF_STATUS);
-   } else {
-   rp_state_cap = intel_uncore_read(&dev_priv->uncore,
GEN6_RP_STATE_CAP);
+   else
gt_perf_status = intel_uncore_read(&dev_priv->uncore,
GEN6_GT_PERF_STATUS);
-   }
 
   

[PATCH 51/53] drm/i915/display/dsc: Set BPP in the kernel

2021-07-01 Thread Matt Roper
From: Anusha Srivatsa 

Set the compressed BPP in the kernel when the connector is DP or eDP,
using the value forced through debugfs (force_dsc_bpp) when one is set.

Cc: Vandita Kulkarni 
Cc: Navare Manasi D 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Patnana Venkata Sai 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 23 ++-
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 5b52beaddada..57aadee69d8b 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -1241,9 +1241,15 @@ static int intel_dp_dsc_compute_config(struct intel_dp 
*intel_dp,
pipe_config->lane_count = limits->max_lane_count;
 
if (intel_dp_is_edp(intel_dp)) {
-   pipe_config->dsc.compressed_bpp =
-   min_t(u16, 
drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4,
- pipe_config->pipe_bpp);
+   if (intel_dp->force_dsc_bpp) {
+   drm_dbg_kms(&dev_priv->drm,
+   "DSC BPC forced to %d", 
intel_dp->force_dsc_bpp);
+   pipe_config->dsc.compressed_bpp = 
intel_dp->force_dsc_bpp;
+   } else {
+   pipe_config->dsc.compressed_bpp =
+   min_t(u16, 
drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4,
+   pipe_config->pipe_bpp);
+   }
pipe_config->dsc.slice_count =
drm_dp_dsc_sink_max_slice_count(intel_dp->dsc_dpcd,
true);
@@ -1269,9 +1275,15 @@ static int intel_dp_dsc_compute_config(struct intel_dp 
*intel_dp,
"Compressed BPP/Slice Count not 
supported\n");
return -EINVAL;
}
-   pipe_config->dsc.compressed_bpp = min_t(u16,
+   if (intel_dp->force_dsc_bpp) {
+   drm_dbg_kms(&dev_priv->drm,
+   "DSC BPC forced to %d\n", 
intel_dp->force_dsc_bpp);
+   pipe_config->dsc.compressed_bpp = 
intel_dp->force_dsc_bpp;
+   } else {
+   pipe_config->dsc.compressed_bpp = min_t(u16,
   
dsc_max_output_bpp >> 4,
   
pipe_config->pipe_bpp);
+   }
pipe_config->dsc.slice_count = dsc_dp_slice_count;
}
/*
@@ -1374,7 +1386,8 @@ intel_dp_compute_link_config(struct intel_encoder 
*encoder,
 * Pipe joiner needs compression upto display12 due to BW limitation. 
DG2
 * onwards pipe joiner can be enabled without compression.
 */
-   drm_dbg_kms(&i915->drm, "Force DSC en = %d\n", intel_dp->force_dsc_en);
+   drm_dbg_kms(&i915->drm, "Force DSC en = %d\n Force DSC BPP = %d\n",
+   intel_dp->force_dsc_en, intel_dp->force_dsc_bpp);
if (ret || intel_dp->force_dsc_en || (DISPLAY_VER(i915) < 13 &&
  pipe_config->bigjoiner)) {
ret = intel_dp_dsc_compute_config(intel_dp, pipe_config,
-- 
2.25.4
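
Editorial sketch, not part of the patch: the selection rule that both
hunks above apply can be written as a small standalone function.  The
function and parameter names are hypothetical, and the reading of the
">> 4" as a conversion from 1/16 bpp units is an assumption; the patch
itself only shows the shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's logic: a non-zero forced value
 * wins, otherwise use the smaller of the sink's output bpp and the pipe bpp.
 * sink_output_bpp_x16 is assumed to be in 1/16 bpp units (hence the >> 4).
 */
static uint16_t pick_compressed_bpp(int force_dsc_bpp,
				    uint16_t sink_output_bpp_x16,
				    int pipe_bpp)
{
	uint16_t sink_bpp, pipe;

	if (force_dsc_bpp)
		return (uint16_t)force_dsc_bpp;

	sink_bpp = sink_output_bpp_x16 >> 4;
	pipe = (uint16_t)pipe_bpp;	/* same truncation as min_t(u16, ...) */

	return sink_bpp < pipe ? sink_bpp : pipe;
}
```

As in the hunks as posted, the forced value is taken as-is here, with no
clamping against the sink or pipe limits.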



[PATCH 50/53] drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable

2021-07-01 Thread Matt Roper
From: Anusha Srivatsa 

DSC can be supported per DP connector. This patch creates
a per connector debugfs node to expose the Input and
Compressed BPP.

The same node can be used from userspace to force
DSC to a certain BPP.

force_dsc_bpp is written through this debugfs
node to force DSC BPP to any of the accepted values.

Cc: Vandita Kulkarni 
Cc: Manasi Navare 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Patnana Venkata Sai 
Signed-off-by: Matt Roper 
---
 .../drm/i915/display/intel_display_debugfs.c  | 103 +-
 .../drm/i915/display/intel_display_types.h|   1 +
 2 files changed, 103 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c 
b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
index af9e58619667..1805d70ea817 100644
--- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
+++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
@@ -2389,6 +2389,100 @@ static const struct file_operations 
i915_dsc_fec_support_fops = {
.write = i915_dsc_fec_support_write
 };
 
+static int i915_dsc_bpp_support_show(struct seq_file *m, void *data)
+{
+   struct drm_connector *connector = m->private;
+   struct drm_device *dev = connector->dev;
+   struct drm_crtc *crtc;
+   struct intel_dp *intel_dp;
+   struct drm_modeset_acquire_ctx ctx;
+   struct intel_crtc_state *crtc_state = NULL;
+   int ret = 0;
+   bool try_again = false;
+
+   drm_modeset_acquire_init(&ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE);
+
+   do {
+   try_again = false;
+   ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
+  &ctx);
+   if (ret) {
+   ret = -EINTR;
+   break;
+   }
+   crtc = connector->state->crtc;
+   if (connector->status != connector_status_connected || !crtc) {
+   ret = -ENODEV;
+   break;
+   }
+   ret = drm_modeset_lock(&crtc->mutex, &ctx);
+   if (ret == -EDEADLK) {
+   ret = drm_modeset_backoff(&ctx);
+   if (!ret) {
+   try_again = true;
+   continue;
+   }
+   break;
+   } else if (ret) {
+   break;
+   }
+   intel_dp = intel_attached_dp(to_intel_connector(connector));
+   crtc_state = to_intel_crtc_state(crtc->state);
+   seq_printf(m, "Input_BPP: %d\n", crtc_state->pipe_bpp);
+   seq_printf(m, "Compressed_BPP: %d\n",
+   crtc_state->dsc.compressed_bpp);
+   } while (try_again);
+
+   drm_modeset_drop_locks(&ctx);
+   drm_modeset_acquire_fini(&ctx);
+
+   return ret;
+}
+
+static ssize_t i915_dsc_bpp_support_write(struct file *file,
+   const char __user *ubuf,
+   size_t len, loff_t *offp)
+{
+   int dsc_bpp = 0;
+   int ret;
+   struct drm_connector *connector =
+   ((struct seq_file *)file->private_data)->private;
+   struct intel_encoder *encoder = 
intel_attached_encoder(to_intel_connector(connector));
+   struct drm_i915_private *i915 = to_i915(encoder->base.dev);
+   struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
+
+   if (len == 0)
+   return 0;
+
+   drm_dbg(&i915->drm,
+   "Copied %zu bytes from user to force BPP\n", len);
+
+   ret = kstrtoint_from_user(ubuf, len, 0, &dsc_bpp);
+
+   intel_dp->force_dsc_bpp = dsc_bpp;
+   if (ret < 0)
+   return ret;
+
+   *offp += len;
+   return len;
+}
+
+static int i915_dsc_bpp_support_open(struct inode *inode,
+  struct file *file)
+{
+   return single_open(file, i915_dsc_bpp_support_show,
+  inode->i_private);
+}
+
+static const struct file_operations i915_dsc_bpp_support_fops = {
+   .owner = THIS_MODULE,
+   .open = i915_dsc_bpp_support_open,
+   .read = seq_read,
+   .llseek = seq_lseek,
+   .release = single_release,
+   .write = i915_dsc_bpp_support_write
+};
+
 /**
  * intel_connector_debugfs_add - add i915 specific connector debugfs files
  * @connector: pointer to a registered drm_connector
@@ -2427,9 +2521,16 @@ int intel_connector_debugfs_add(struct drm_connector 
*connector)
connector, &i915_hdcp_sink_capability_fops);
}
 
-   if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) && 
((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort && 
!to_intel_connector(connector)->mst_port) || connector->connector_type == 
DRM_MODE_CONNECTOR_eDP))
+   if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) &&
+   ((connector->connector_type == 

[PATCH 20/53] drm/i915/xehpsdv: Define steering tables

2021-07-01 Thread Matt Roper
Define and initialize the MMIO ranges for which XeHP SDV requires MSLICE
and LNCF steering.

Bspec: 66534
Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt.c  | 19 ++-
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 11 +--
 2 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index f59bcedbb80b..9d1c99c9c0dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -89,6 +89,20 @@ static const struct intel_mmio_range 
icl_l3bank_steering_table[] = {
{},
 };
 
+static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
+   { 0x004000, 0x004AFF },
+   { 0x00C800, 0x00CFFF },
+   { 0x00DD00, 0x00DDFF },
+   { 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
+   {},
+};
+
+static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
+   { 0x00B000, 0x00B0FF },
+   { 0x00D800, 0x00D8FF },
+   {},
+};
+
 static u16 slicemask(struct intel_gt *gt, int count)
 {
u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
@@ -113,7 +127,10 @@ int intel_gt_init_mmio(struct intel_gt *gt)
(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
 GEN12_MEML3_EN_MASK);
 
-   if (GRAPHICS_VER(gt->i915) >= 11 &&
+   if (IS_XEHPSDV(gt->i915)) {
+   gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+   gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
+   } else if (GRAPHICS_VER(gt->i915) >= 11 &&
   GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50)) {
gt->steering_table[L3BANK] = icl_l3bank_steering_table;
gt->info.l3bank_mask =
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 060d84897635..4302dc1b728e 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -989,7 +989,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct 
i915_wa_list *wal)
__add_mcr_wa(i915, wal, slice, subslice);
 }
 
-__maybe_unused
 static void
 xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
 {
@@ -1208,10 +1207,18 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
VSUNIT_CLKGATE_DIS_TGL);
 }
 
+static void
+xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list 
*wal)
+{
+   xehp_init_mcr(&i915->gt, wal);
+}
+
 static void
 gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
-   if (IS_DG1(i915))
+   if (IS_XEHPSDV(i915))
+   xehpsdv_gt_workarounds_init(i915, wal);
+   else if (IS_DG1(i915))
dg1_gt_workarounds_init(i915, wal);
else if (IS_TIGERLAKE(i915))
tgl_gt_workarounds_init(i915, wal);
-- 
2.25.4
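
Editorial sketch, not part of the patch: the tables above are arrays of
inclusive ranges closed by an empty entry, so a lookup can walk them
until it reaches the terminator.  The struct and function names below
are hypothetical stand-ins for the driver's own, and the end of the last
MSLICE range is taken to be 0x00FFFF, which is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct intel_mmio_range. */
struct mmio_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

static const struct mmio_range mslice_table[] = {
	{ 0x004000, 0x004AFF },
	{ 0x00C800, 0x00CFFF },
	{ 0x00DD00, 0x00DDFF },
	{ 0x00E900, 0x00FFFF },	/* end value assumed */
	{ 0, 0 },		/* terminator */
};

static const struct mmio_range lncf_table[] = {
	{ 0x00B000, 0x00B0FF },
	{ 0x00D800, 0x00D8FF },
	{ 0, 0 },		/* terminator */
};

/* Return true if offset falls inside any range of a terminated table. */
static bool range_table_contains(const struct mmio_range *table,
				 uint32_t offset)
{
	size_t i;

	for (i = 0; table[i].end > 0; i++) {
		if (offset >= table[i].start && offset <= table[i].end)
			return true;
	}

	return false;
}
```

With these tables, 0x4000 would be treated as needing MSLICE steering
and 0xD800 as needing LNCF steering, while 0x4B00 matches neither.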



[PATCH 21/53] drm/i915/xehpsdv: Define MOCS table for XeHP SDV

2021-07-01 Thread Matt Roper
From: Lucas De Marchi 

Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a
dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to
memory for L3 destined transaction" and L3_LKUP to "enable Lookup for
uncacheable accesses".

Bspec: 45101
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Stuart Summers 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_mocs.c | 33 +++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c 
b/drivers/gpu/drm/i915/gt/intel_mocs.c
index 17848807f111..0c9d0b936c20 100644
--- a/drivers/gpu/drm/i915/gt/intel_mocs.c
+++ b/drivers/gpu/drm/i915/gt/intel_mocs.c
@@ -40,6 +40,8 @@ struct drm_i915_mocs_table {
 #define L3_ESC(value)  ((value) << 0)
 #define L3_SCC(value)  ((value) << 1)
 #define _L3_CACHEABILITY(value)    ((value) << 4)
+#define L3_GLBGO(value)    ((value) << 6)
+#define L3_LKUP(value) ((value) << 7)
 
 /* Helper defines */
 #define GEN9_NUM_MOCS_ENTRIES  64  /* 63-64 are reserved, but configured. */
@@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry dg1_mocs_table[] = 
{
MOCS_ENTRY(63, 0, L3_1_UC),
 };
 
+static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
+   /* wa_1608975824 */
+   MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)),
+
+   /* UC - Coherent; GO:L3 */
+   MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)),
+   /* UC - Coherent; GO:Memory */
+   MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
+   /* UC - Non-Coherent; GO:Memory */
+   MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)),
+   /* UC - Non-Coherent; GO:L3 */
+   MOCS_ENTRY(4, 0, L3_1_UC),
+
+   /* WB */
+   MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)),
+
+   /* HW Reserved - SW program but never use. */
+   MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)),
+   MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)),
+   MOCS_ENTRY(60, 0, L3_1_UC),
+   MOCS_ENTRY(61, 0, L3_1_UC),
+   MOCS_ENTRY(62, 0, L3_1_UC),
+   MOCS_ENTRY(63, 0, L3_1_UC),
+};
+
 enum {
HAS_GLOBAL_MOCS = BIT(0),
HAS_ENGINE_MOCS = BIT(1),
@@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct 
drm_i915_private *i915,
 {
unsigned int flags;
 
-   if (IS_DG1(i915)) {
+   if (IS_XEHPSDV(i915)) {
+   table->size = ARRAY_SIZE(xehpsdv_mocs_table);
+   table->table = xehpsdv_mocs_table;
+   table->n_entries = GEN9_NUM_MOCS_ENTRIES;
+   } else if (IS_DG1(i915)) {
table->size = ARRAY_SIZE(dg1_mocs_table);
table->table = dg1_mocs_table;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
-- 
2.25.4
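
Editorial sketch, not part of the patch: how the two new bits combine
with the cacheability field in a table entry's L3 control value.  The
shifts for _L3_CACHEABILITY, L3_GLBGO and L3_LKUP come from the hunk
above; the L3_1_UC and L3_3_WB encodings (1 and 3) are assumed from the
macro names, and mocs_l3_bits() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define _L3_CACHEABILITY(value)	((uint32_t)(value) << 4)
#define L3_GLBGO(value)		((uint32_t)(value) << 6)
#define L3_LKUP(value)		((uint32_t)(value) << 7)

/* Assumed encodings: 1 = uncached, 3 = writeback. */
#define L3_1_UC			_L3_CACHEABILITY(1)
#define L3_3_WB			_L3_CACHEABILITY(3)

/* Hypothetical helper: build the L3 control value for one entry. */
static uint32_t mocs_l3_bits(uint32_t cacheability, bool go_memory,
			     bool lookup)
{
	return cacheability | L3_GLBGO(go_memory ? 1 : 0) |
	       L3_LKUP(lookup ? 1 : 0);
}
```

Under these assumptions, entry 2 of the table ("UC - Coherent;
GO:Memory") would come out as 0xD0 and entry 5 ("WB") as 0xB0.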



[PATCH 23/53] drm/i915/xehpsdv: Read correct RP_STATE_CAP register

2021-07-01 Thread Matt Roper
The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this
register is now a per-tile register at GTTMMADDR offset 0x250014.

Cc: Rodrigo Vivi 
Signed-off-by: Matt Roper 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++-
 drivers/gpu/drm/i915/i915_reg.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c 
b/drivers/gpu/drm/i915/gt/intel_rps.c
index 490bc1513480..8e7b70248392 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -1937,7 +1937,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps)
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
 
-   if (IS_GEN9_LP(i915))
+   if (IS_XEHPSDV(i915))
+   return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
+   else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
else
return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 0231f42226db..2992e8585399 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -4110,6 +4110,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN6_RP_STATE_CAP  _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998)
 #define BXT_RP_STATE_CAP   _MMIO(0x138170)
 #define GEN9_RP_STATE_LIMITS   _MMIO(0x138148)
+#define XEHPSDV_RP_STATE_CAP   _MMIO(0x250014)
 
 /*
  * Logical Context regs
-- 
2.25.4
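
Editorial sketch, not part of the patch: the register selection order
after this change, together with the byte-wise decode that
gen6_rps_init() applies to the value read.  The enum, struct and
function names are hypothetical; the offsets for XeHP SDV and BXT come
from the i915_reg.h hunk, while the 0x140000 MCHBAR mirror base is an
assumption not shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum platform {
	PLATFORM_XEHPSDV,
	PLATFORM_GEN9_LP,
	PLATFORM_OTHER,
};

/* Pick the RP_STATE_CAP offset in the same order as the patch. */
static uint32_t rp_state_cap_offset(enum platform p)
{
	if (p == PLATFORM_XEHPSDV)
		return 0x250014;
	else if (p == PLATFORM_GEN9_LP)
		return 0x138170;
	else
		return 0x140000 + 0x5998;	/* mirror base assumed */
}

/* Hypothetical container for the three decoded frequency points. */
struct rp_caps {
	uint8_t rp0;
	uint8_t rp1;
	uint8_t min_freq;	/* RPn */
};

/* One byte per field; GEN9 LP stores the fields in the reverse order. */
static struct rp_caps decode_rp_state_cap(uint32_t raw, bool is_gen9_lp)
{
	struct rp_caps caps;

	caps.rp1 = (raw >> 8) & 0xff;
	if (is_gen9_lp) {
		caps.rp0 = (raw >> 16) & 0xff;
		caps.min_freq = raw & 0xff;
	} else {
		caps.rp0 = raw & 0xff;
		caps.min_freq = (raw >> 16) & 0xff;
	}

	return caps;
}
```

Keeping the platform check inside one reader is what lets this patch
add the XeHP SDV case in a single place instead of at every caller.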



[PATCH 38/53] drm/i915/dg2: Add dbuf programming

2021-07-01 Thread Matt Roper
DG2 extends our DDB to four DBuf slices; pipes A+B only have access to
the first two slices, whereas pipes C+D only have access to the second
two.

Confusingly, our bspec decided to switch from 1-based numbering
of dbuf slices (S1, S2) to 0-based numbering (S0, S1, S2, S3) in
Display13.  At the moment we're using the 0-based numbering scheme for the
DBUF_CTL_S() register addressing, but the 1-based numbering scheme in the
actual slice assignment tables.  We may want to consider switching the
assignment over to 0-based numbering too at some point...

Bspec: 49255
Bspec: 50057
Cc: Stanislav Lisovskiy 
Signed-off-by: Matt Roper 
---
 .../drm/i915/display/intel_display_power.h|   4 +
 drivers/gpu/drm/i915/intel_pm.c   | 120 +-
 2 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h 
b/drivers/gpu/drm/i915/display/intel_display_power.h
index 22367b5cba96..ad788bbd727d 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.h
+++ b/drivers/gpu/drm/i915/display/intel_display_power.h
@@ -392,6 +392,10 @@ intel_display_power_put_all_in_set(struct drm_i915_private 
*i915,
intel_display_power_put_mask_in_set(i915, power_domain_set, 
power_domain_set->mask);
 }
 
+/*
+ * FIXME: We should probably switch this to a 0-based scheme to be consistent
+ * with how we now name/number DBUF_CTL instances.
+ */
 enum dbuf_slice {
DBUF_S1,
DBUF_S2,
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 5fdb96e7d266..ff8d89fff502 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4584,6 +4584,117 @@ static const struct dbuf_slice_conf_entry 
tgl_allowed_dbufs[] =
{}
 };
 
+static const struct dbuf_slice_conf_entry dg2_allowed_dbufs[] = {
+   {
+   .active_pipes = BIT(PIPE_A),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_B),
+   .dbuf_mask = {
+   [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_A) | BIT(PIPE_B),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1),
+   [PIPE_B] = BIT(DBUF_S2),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_C),
+   .dbuf_mask = {
+   [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_A) | BIT(PIPE_C),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_B) | BIT(PIPE_C),
+   .dbuf_mask = {
+   [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1),
+   [PIPE_B] = BIT(DBUF_S2),
+   [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_D),
+   .dbuf_mask = {
+   [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_A) | BIT(PIPE_D),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_B) | BIT(PIPE_D),
+   .dbuf_mask = {
+   [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1),
+   [PIPE_B] = BIT(DBUF_S2),
+   [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_C) | BIT(PIPE_D),
+   .dbuf_mask = {
+   [PIPE_C] = BIT(DBUF_S3),
+   [PIPE_D] = BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D),
+   .dbuf_mask = {
+   [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2),
+   [PIPE_C] = BIT(DBUF_S3),
+   [PIPE_D] = BIT(DBUF_S4),
+   },
+   },
+   {
+   .active_pipes = 

[PATCH 39/53] drm/i915/dg2: Don't program BW_BUDDY registers

2021-07-01 Thread Matt Roper
Although the BW_BUDDY registers still exist, they are not used for
anything on DG2.  This change is expected to hold true for future dgpu's
too.

Bspec: 49218
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display_power.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index c34ff0947b85..df6358638fee 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -5814,6 +5814,10 @@ static void tgl_bw_buddy_init(struct drm_i915_private 
*dev_priv)
unsigned long abox_mask = INTEL_INFO(dev_priv)->abox_mask;
int config, i;
 
+   /* BW_BUDDY registers are not used on dgpu's beyond DG1 */
+   if (IS_DGFX(dev_priv) && !IS_DG1(dev_priv))
+   return;
+
if (IS_ALDERLAKE_S(dev_priv) ||
IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
-- 
2.25.4



[PATCH 30/53] drm/i915/dg2: Maintain backward-compatible nested batch behavior

2021-07-01 Thread Matt Roper
For tgl+, the per-context setting of MI_MODE[12] determines whether
the bits of a nested MI_BATCH_BUFFER_START instruction should be
interpreted in the traditional manner or whether they should
instead use a new tgl+ meaning that breaks backward compatibility, but
allows nesting into 3rd-level batchbuffers.  For previous platforms,
the hardware default for this register bit is to maintain
backward-compatible behavior unless a context intentionally opts into
the new behavior; however Xe_HPG flips the hardware default behavior.

From a SW perspective, we want to maintain the backward-compatible
behavior for userspace, so we'll apply a fake workaround to set it back
to the legacy behavior on platforms where the hardware default is to
break compatibility.  At the moment there is no Linux userspace that
utilizes third-level batchbuffers, so keeping the legacy meaning is the
correct thing to do and spares userspace from needing to make any
changes.  If/when we have userspace consumers that want to utilize
third-level batch nesting, we can provide a context parameter to allow
them to opt-in.

Bspec: 45974, 45718
Cc: John Harrison 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++--
 drivers/gpu/drm/i915/i915_reg.h |  1 +
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index f97ff2848122..43db766b0672 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -686,6 +686,37 @@ static void dg1_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
 }
 
+static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
+struct i915_wa_list *wal)
+{
+   /*
+* This is a "fake" workaround defined by software to ensure we
+* maintain reliable, backward-compatible behavior for userspace with
+* regards to how nested MI_BATCH_BUFFER_START commands are handled.
+*
+* The per-context setting of MI_MODE[12] determines whether the bits
+* of a nested MI_BATCH_BUFFER_START instruction should be interpreted
+* in the traditional manner or whether they should instead use a new
+* tgl+ meaning that breaks backward compatibility, but allows nesting
+* into 3rd-level batchbuffers.  When this new capability was first
+* added in TGL, it remained off by default unless a context
+* intentionally opted in to the new behavior.  However Xe_HPG now
+* flips this on by default and requires that we explicitly opt out if
+* we don't want the new behavior.
+*
+* From a SW perspective, we want to maintain the backward-compatible
+* behavior for userspace, so we'll apply a fake workaround to set it
+* back to the legacy behavior on platforms where the hardware default
+* is to break compatibility.  At the moment there is no Linux
+* userspace that utilizes third-level batchbuffers, so this will keep
+* userspace from needing to make any changes; using the legacy
+* meaning is the correct thing to do.  If/when we have userspace
+* consumers that want to utilize third-level batch nesting, we can
+* provide a context parameter to allow them to opt-in.
+*/
+   wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN);
+}
+
 static void
 __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
   struct i915_wa_list *wal,
@@ -693,11 +724,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
 {
struct drm_i915_private *i915 = engine->i915;
 
+   wa_init_start(wal, name, engine->name);
+
+   /* Applies to all engines */
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
+   fakewa_disable_nestedbb_mode(engine, wal);
+
if (engine->class != RENDER_CLASS)
return;
 
-   wa_init_start(wal, name, engine->name);
-
if (IS_DG1(i915))
dg1_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 12)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index b19d102e0a01..35a42df1f2aa 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2821,6 +2821,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define MI_MODE    _MMIO(0x209c)
 # define VS_TIMER_DISPATCH (1 << 6)
 # define MI_FLUSH_ENABLE   (1 << 12)
+# define TGL_NESTED_BB_EN  (1 << 12)
 # define ASYNC_FLIP_PERF_DISABLE   (1 << 14)
 # define MODE_IDLE (1 << 9)
 # define STOP_RING  

[PATCH 26/53] drm/i915/dg2: Add forcewake table

2021-07-01 Thread Matt Roper
The DG2 forcewake table is very similar to the one used by XeHP SDV (and
both platforms are even presented as a single table in the bspec).  For
the most part DG2 starts using a few additional ranges that were
'reserved' on XeHP SDV and stops using some others.  However there is a
single range (0xd800-0xd87f) that needs to be handled differently
between the two platforms (it needs GT wake on XeHP SDV, but render wake
on DG2) so unless we want to wake both domains (which could waste power)
or define new types of forcewake domains for this special case we need
to have separate tables for the two platforms.  Let's define the ranges
for both platforms with a parameterized macro so that we don't actually
need to duplicate everything in the code.

It should be fine for DG2 to re-use the Xe_HP shadow register list so we
can continue to use the 'xehpsdv' MMIO write functions and don't need to
spin up a separate DG2 instance.

Bspec: 66534
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/intel_uncore.c | 305 +++-
 1 file changed, 168 insertions(+), 137 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 676b0052f01e..0c35acfcd6da 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -1317,143 +1317,170 @@ static const struct intel_forcewake_range 
__gen12_fw_ranges[] = {
0x1d3f00 - 0x1d3fff: VD2 */
 };
 
-/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */
-static const struct intel_forcewake_range __xehp_fw_ranges[] = {
-   GEN_FW_RANGE(0x0, 0x1fff, 0), /*
- 0x0 -  0xaff: reserved
-   0xb00 - 0x1fff: always on */
-   GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT),
-   GEN_FW_RANGE(0x4b00, 0x51ff, 0), /*
-   0x4b00 - 0x4fff: reserved
-   0x5000 - 0x51ff: always on */
-   GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT),
-   GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x8160, 0x81ff, 0), /*
-   0x8160 - 0x817f: reserved
-   0x8180 - 0x81ff: always on */
-   GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT),
-   GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x8500, 0x94cf, FORCEWAKE_GT), /*
-   0x8500 - 0x87ff: gt
-   0x8800 - 0x8fff: reserved
-   0x9000 - 0x947f: gt
-   0x9480 - 0x94cf: reserved */
-   GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0x9560, 0x97ff, 0), /*
-   0x9560 - 0x95ff: always on
-   0x9600 - 0x97ff: reserved */
-   GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /*
-   0x9800 - 0xb4ff: gt
-   0xb500 - 0xbfff: reserved
-   0xc000 - 0xcfff: gt */
-   GEN_FW_RANGE(0xd000, 0xd7ff, 0),
-   GEN_FW_RANGE(0xd800, 0xdbff, FORCEWAKE_GT),
-   GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER),
-   GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /*
-   0xdd00 - 0xddff: gt
-   0xde00 - 0xde7f: reserved */
-   GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /*
-   0xde80 - 0xdfff: render
-   0xe000 - 0xe0ff: reserved
-   0xe100 - 0xe8ff: render */
-   GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /*
-   0xe900 - 0xe9ff: gt
-   0xea00 - 0xefff: reserved
-   0xf000 - 0xffff: gt */
-   GEN_FW_RANGE(0x10000, 0x13fff, 0), /*
-   0x10000 - 0x11fff: reserved
-   0x12000 - 0x127ff: always on
-   0x12800 - 0x13fff: reserved */
-   GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0),
-   GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2),
-   GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4),
-   GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6),
-   GEN_FW_RANGE(0x14800, 0x1ffff, FORCEWAKE_RENDER), /*
-   0x14800 - 0x14fff: render
-   0x15000 - 0x16dff: reserved
-   0x16e00 - 0x1ffff: render */
-   GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /*
-   0x20000 - 0x20fff: VD0
-   0x21000 - 0x21fff: reserved */
-   GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT),
-   GEN_FW_RANGE(0x24000, 0x2417f, 0), /*
-   0x24000 - 0x2407f: always on
-   0x24080 - 0x2417f: reserved */
-   GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /*
-   0x24180 - 0x241ff: gt
-   0x24200 - 0x249ff: reserved */
-   GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /*
-   0x24a00 - 0x24a7f: render
-   0x24a80 - 0x251ff: reserved */
-   GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /*
-   0x25200 - 0x252ff: gt
-   0x25300 - 

[PATCH 41/53] drm/i915/dg2: DG2 has fixed memory bandwidth

2021-07-01 Thread Matt Roper
DG2 doesn't have SAGV or QGV points that determine memory bandwidth.
Instead it has a constant amount of memory bandwidth available to
display that does not need to be reduced based on the number of active
planes.

For simplicity, we'll just modify driver initialization to create a
single dummy QGV point with the proper amount of memory bandwidth,
rather than trying to query the pcode for this information.

Bspec: 64631
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_bw.c | 24 +++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c 
b/drivers/gpu/drm/i915/display/intel_bw.c
index bfb398f0432e..4ca83874d0aa 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -234,6 +234,26 @@ static int icl_get_bw_info(struct drm_i915_private 
*dev_priv, const struct intel
return 0;
 }
 
+static void dg2_get_bw_info(struct drm_i915_private *i915)
+{
+   struct intel_bw_info *bi = &i915->max_bw[0];
+
+   /*
+* DG2 doesn't have SAGV or QGV points, just a constant max bandwidth
+* that doesn't depend on the number of planes enabled.  Create a
+* single dummy QGV point to reflect that.  DG2-G10 platforms have a
+* constant 50 GB/s bandwidth, whereas DG2-G11 platforms have 38 GB/s.
+*/
+   bi->num_planes = 1;
+   bi->num_qgv_points = 1;
+   if (IS_DG2_G11(i915))
+   bi->deratedbw[0] = 38000;
+   else
+   bi->deratedbw[0] = 50000;
+
+   i915->sagv_status = I915_SAGV_NOT_CONTROLLED;
+}
+
 static unsigned int icl_max_bw(struct drm_i915_private *dev_priv,
   int num_planes, int qgv_point)
 {
@@ -267,7 +287,9 @@ void intel_bw_init_hw(struct drm_i915_private *dev_priv)
if (!HAS_DISPLAY(dev_priv))
return;
 
-   if (IS_ALDERLAKE_S(dev_priv) || IS_ALDERLAKE_P(dev_priv))
+   if (IS_DG2(dev_priv))
+   dg2_get_bw_info(dev_priv);
+   else if (IS_ALDERLAKE_S(dev_priv) || IS_ALDERLAKE_P(dev_priv))
icl_get_bw_info(dev_priv, _sa_info);
else if (IS_ROCKETLAKE(dev_priv))
icl_get_bw_info(dev_priv, _sa_info);
-- 
2.25.4
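
Editorial sketch, not part of the patch: the "single dummy QGV point"
idea in isolation.  The struct is a hypothetical, cut-down stand-in for
struct intel_bw_info, and the MB/s unit is inferred from the comment
pairing 38 GB/s with the value 38000; the 50000 figure follows the same
reading of the 50 GB/s case.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QGV_POINTS 8	/* hypothetical array bound */

/* Hypothetical, reduced stand-in for struct intel_bw_info. */
struct bw_info {
	unsigned int num_qgv_points;
	unsigned int num_planes;
	unsigned int deratedbw[MAX_QGV_POINTS];	/* assumed to be MB/s */
};

/*
 * Describe a platform with fixed display bandwidth as one QGV point,
 * so that code iterating over QGV points still finds a valid entry.
 */
static void fill_fixed_bw(struct bw_info *bi, bool is_g11)
{
	bi->num_planes = 1;
	bi->num_qgv_points = 1;
	bi->deratedbw[0] = is_g11 ? 38000 : 50000;
}
```

The design choice here is to reuse the existing per-point data layout
rather than add a separate code path for platforms without QGV points.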



[PATCH 06/53] drm/i915/selftests: Allow for larger engine counts

2021-07-01 Thread Matt Roper
From: John Harrison 

Increasing the engine count causes a couple of local array variables
to exceed the kernel stack limit. So make them dynamic allocations
instead.

Signed-off-by: John Harrison 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 10 --
 .../gpu/drm/i915/gt/selftest_workarounds.c| 32 ---
 2 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c 
b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 08896ae027d5..1e7fe479 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, 
unsigned int flags)
 #define BATCH BIT(0)
 {
struct task_struct *tsk[I915_NUM_ENGINES] = {};
-   struct preempt_smoke arg[I915_NUM_ENGINES];
+   struct preempt_smoke *arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
unsigned long count;
int err = 0;
 
+   arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL);
+   if (!arg)
+   return -ENOMEM;
+
for_each_engine(engine, smoke->gt, id) {
arg[id] = *smoke;
arg[id].engine = engine;
@@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, 
unsigned int flags)
arg[id].batch = NULL;
arg[id].count = 0;
 
-   tsk[id] = kthread_run(smoke_crescendo_thread, &arg[id],
+   tsk[id] = kthread_run(smoke_crescendo_thread, arg,
  "igt/smoke:%d", id);
if (IS_ERR(tsk[id])) {
err = PTR_ERR(tsk[id]);
@@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, 
unsigned int flags)
 
pr_info("Submitted %lu crescendo:%x requests across %d engines and %d 
contexts\n",
count, flags, smoke->gt->info.num_engines, smoke->ncontext);
+
+   kfree(arg);
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c 
b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 7ebc4edb8ecf..7a38ce40feb2 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -1175,31 +1175,36 @@ live_gpu_reset_workarounds(void *arg)
 {
struct intel_gt *gt = arg;
intel_wakeref_t wakeref;
-   struct wa_lists lists;
+   struct wa_lists *lists;
bool ok;
 
if (!intel_has_gpu_reset(gt))
return 0;
 
+   lists = kzalloc(sizeof(*lists), GFP_KERNEL);
+   if (!lists)
+   return -ENOMEM;
+
pr_info("Verifying after GPU reset...\n");
 
igt_global_reset_lock(gt);
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
 
-   reference_lists_init(gt, &lists);
+   reference_lists_init(gt, lists);
 
-   ok = verify_wa_lists(gt, &lists, "before reset");
+   ok = verify_wa_lists(gt, lists, "before reset");
if (!ok)
goto out;
 
intel_gt_reset(gt, ALL_ENGINES, "live_workarounds");
 
-   ok = verify_wa_lists(gt, &lists, "after reset");
+   ok = verify_wa_lists(gt, lists, "after reset");
 
 out:
-   reference_lists_fini(gt, &lists);
+   reference_lists_fini(gt, lists);
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
igt_global_reset_unlock(gt);
+   kfree(lists);
 
return ok ? 0 : -ESRCH;
 }
@@ -1214,16 +1219,20 @@ live_engine_reset_workarounds(void *arg)
struct igt_spinner spin;
struct i915_request *rq;
intel_wakeref_t wakeref;
-   struct wa_lists lists;
+   struct wa_lists *lists;
int ret = 0;
 
if (!intel_has_reset_engine(gt))
return 0;
 
+   lists = kzalloc(sizeof(*lists), GFP_KERNEL);
+   if (!lists)
+   return -ENOMEM;
+
igt_global_reset_lock(gt);
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
 
-   reference_lists_init(gt, &lists);
+   reference_lists_init(gt, lists);
 
for_each_engine(engine, gt, id) {
bool ok;
@@ -1235,7 +1244,7 @@ live_engine_reset_workarounds(void *arg)
break;
}
 
-   ok = verify_wa_lists(gt, &lists, "before reset");
+   ok = verify_wa_lists(gt, lists, "before reset");
if (!ok) {
ret = -ESRCH;
goto err;
@@ -1247,7 +1256,7 @@ live_engine_reset_workarounds(void *arg)
goto err;
}
 
-   ok = verify_wa_lists(gt, &lists, "after idle reset");
+   ok = verify_wa_lists(gt, lists, "after idle reset");
if (!ok) {
ret = -ESRCH;
goto err;
@@ -1282,7 +1291,7 @@ live_engine_reset_workarounds(void *arg)
igt_spinner_end(&spin);

[PATCH 37/53] drm/i915/dg2: Setup display outputs

2021-07-01 Thread Matt Roper
DG2 has outputs on DDI A-D attached to what the bspec diagram shows as
"Combo PHY A-D."  Note that despite being labelled "combo" the PHYs on
these outputs are Synopsys PHYs rather than traditional Intel combo PHY
technology.

Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index c673d0c8fb4a..dc2b943a4e72 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11329,7 +11329,12 @@ static void intel_setup_outputs(struct 
drm_i915_private *dev_priv)
if (!HAS_DISPLAY(dev_priv))
return;
 
-   if (IS_ALDERLAKE_P(dev_priv)) {
+   if (IS_DG2(dev_priv)) {
+   intel_ddi_init(dev_priv, PORT_A);
+   intel_ddi_init(dev_priv, PORT_B);
+   intel_ddi_init(dev_priv, PORT_C);
+   intel_ddi_init(dev_priv, PORT_D_XELPD);
+   } else if (IS_ALDERLAKE_P(dev_priv)) {
intel_ddi_init(dev_priv, PORT_A);
intel_ddi_init(dev_priv, PORT_B);
intel_ddi_init(dev_priv, PORT_TC1);
-- 
2.25.4



[PATCH 31/53] drm/i915/dg2: Report INSTDONE_GEOM values in error state

2021-07-01 Thread Matt Roper
Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team
has indicated that having these reported in the error state would be
useful for debugging GPU hangs.  These registers are replicated per-DSS
with gslice steering.

Cc: Lionel Landwerlin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c|  7 +++
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  3 +++
 drivers/gpu/drm/i915/i915_gpu_error.c| 10 --
 drivers/gpu/drm/i915/i915_reg.h  |  1 +
 4 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index e1302e9c168b..b3c002e4ae9f 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1220,6 +1220,13 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
  GEN7_ROW_INSTDONE);
}
}
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
+   for_each_instdone_gslice_dss_xehp(i915, sseu, iter, 
slice, subslice)
+   instdone->geom_svg[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ 
XEHPG_INSTDONE_GEOM_SVG);
+   }
} else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e917b7519f2b..93609d797ac2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -80,6 +80,9 @@ struct intel_instdone {
u32 slice_common_extra[2];
u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
+
+   /* Added in XeHPG */
+   u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
b/drivers/gpu/drm/i915/i915_gpu_error.c
index c1e744b5ab47..4de7edc451ef 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -431,6 +431,7 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu;
int slice;
int subslice;
+   int iter;
 
err_printf(m, "  INSTDONE: 0x%08x\n",
   ee->instdone.instdone);
@@ -445,8 +446,6 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
return;
 
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
-   int iter;
-
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, 
subslice)
err_printf(m, "  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
   slice, subslice,
@@ -471,6 +470,13 @@ static void error_print_instdone(struct 
drm_i915_error_state_buf *m,
if (GRAPHICS_VER(m->i915) < 12)
return;
 
+   if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) {
+   for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, 
subslice)
+   err_printf(m, "  GEOM_SVGUNIT_INSTDONE[%d][%d]: 
0x%08x\n",
+  slice, subslice,
+  ee->instdone.geom_svg[slice][subslice]);
+   }
+
err_printf(m, "  SC_INSTDONE_EXTRA: 0x%08x\n",
   ee->instdone.slice_common_extra[0]);
err_printf(m, "  SC_INSTDONE_EXTRA2: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 35a42df1f2aa..d58864c7adc6 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2686,6 +2686,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_SC_INSTDONE_EXTRA2   _MMIO(0x7108)
 #define GEN7_SAMPLER_INSTDONE  _MMIO(0xe160)
 #define GEN7_ROW_INSTDONE  _MMIO(0xe164)
+#define XEHPG_INSTDONE_GEOM_SVG    _MMIO(0x666c)
 #define MCFG_MCR_SELECTOR  _MMIO(0xfd0)
 #define SF_MCR_SELECTOR    _MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR  _MMIO(0xfdc)
-- 
2.25.4



[PATCH 14/53] drm/i915/xehp: handle new steering options

2021-07-01 Thread Matt Roper
From: Daniele Ceraolo Spurio 

Xe_HP is more modular than its predecessors and as a consequence it has
more types of replicated registers.  As with l3bank regions on previous
platforms, we may need to explicitly re-steer accesses to these new
types of ranges at runtime if we can't find a single default steering
value that satisfies the fusing of all types.

Bspec: 66534
Cc: Tvrtko Ursulin 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matt Roper 
---
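Illustration only, not part of the patch: a userspace sketch of how an
mslice mask can be derived from a flat DSS mask and then turned into
steering values, following the intel_gt.c hunks below.  The helper names
are hypothetical, the DSS-per-mslice count is a caller-supplied
assumption, and first_set_sketch() merely mimics what __ffs() is used
for here.

```c
#include <assert.h>
#include <stdint.h>

/* One output bit per group of dss_per_slice input bits that has any bit set. */
static uint16_t slicemask_from_dssmask_sketch(uint64_t dss_mask, int dss_per_slice)
{
	uint16_t slice_mask = 0;
	int i;

	assert(dss_per_slice > 0 && dss_per_slice < 64);

	for (i = 0; dss_mask && i < 16; i++) {
		if (dss_mask & ((UINT64_C(1) << dss_per_slice) - 1))
			slice_mask |= (uint16_t)(1u << i);
		dss_mask >>= dss_per_slice;
	}
	return slice_mask;
}

/* Index of the lowest set bit; the mask must be non-zero. */
static unsigned int first_set_sketch(uint32_t mask)
{
	unsigned int bit = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		bit++;
	}
	return bit;
}
```

With these, an mslice counts as available if either its DSS group or its
meml3 fuse bit is set (the two masks are OR'd together); MSLICE steering
would use first_set_sketch(mask) directly, and LNCF steering the same
value shifted left by one so that LNCF 0 of that mslice is selected.
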
 drivers/gpu/drm/i915/gt/intel_gt.c  | 40 -
 drivers/gpu/drm/i915/gt/intel_gt.h  |  1 +
 drivers/gpu/drm/i915/gt/intel_gt_types.h|  7 ++
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |  1 +
 drivers/gpu/drm/i915/gt/intel_sseu.c| 18 +
 drivers/gpu/drm/i915/gt/intel_sseu.h|  6 ++
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 89 +++--
 drivers/gpu/drm/i915/i915_drv.h |  3 +
 drivers/gpu/drm/i915/i915_pci.c |  1 +
 drivers/gpu/drm/i915/i915_reg.h |  4 +
 drivers/gpu/drm/i915/intel_device_info.h|  1 +
 11 files changed, 165 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index e714e21c0a4d..f59bcedbb80b 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -89,6 +89,13 @@ static const struct intel_mmio_range 
icl_l3bank_steering_table[] = {
{},
 };
 
+static u16 slicemask(struct intel_gt *gt, int count)
+{
+   u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
+
+   return intel_slicemask_from_dssmask(dss_mask, count);
+}
+
 int intel_gt_init_mmio(struct intel_gt *gt)
 {
intel_gt_init_clock_frequency(gt);
@@ -96,11 +103,24 @@ int intel_gt_init_mmio(struct intel_gt *gt)
intel_uc_init_mmio(&gt->uc);
intel_sseu_info_init(gt);
 
-   if (GRAPHICS_VER(gt->i915) >= 11) {
+   /*
+* An mslice is unavailable only if both the meml3 for the slice is
+* disabled *and* all of the DSS in the slice (quadrant) are disabled.
+*/
+   if (HAS_MSLICES(gt->i915))
+   gt->info.mslice_mask =
+   slicemask(gt, GEN_DSS_PER_MSLICE) |
+   (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
+GEN12_MEML3_EN_MASK);
+
+   if (GRAPHICS_VER(gt->i915) >= 11 &&
+  GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50)) {
gt->steering_table[L3BANK] = icl_l3bank_steering_table;
gt->info.l3bank_mask =
~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
GEN10_L3BANK_MASK;
+   } else if (HAS_MSLICES(gt->i915)) {
+   MISSING_CASE(INTEL_INFO(gt->i915)->platform);
}
 
return intel_engines_init_mmio(gt);
@@ -766,6 +786,24 @@ static void intel_gt_get_valid_steering(struct intel_gt 
*gt,
*sliceid = 0;   /* unused */
*subsliceid = __ffs(gt->info.l3bank_mask);
break;
+   case MSLICE:
+   GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be 
impossible! */
+
+   *sliceid = __ffs(gt->info.mslice_mask);
+   *subsliceid = 0;/* unused */
+   break;
+   case LNCF:
+   GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be 
impossible! */
+
+   /*
+* 0xFDC[29:28] selects the mslice to steer to and 0xFDC[27]
+* selects which LNCF within the mslice to steer to.  An LNCF
+* is always present if its mslice is present, so we can safely
+* just steer to LNCF 0 in all cases.
+*/
+   *sliceid = __ffs(gt->info.mslice_mask) << 1;
+   *subsliceid = 0;/* unused */
+   break;
default:
MISSING_CASE(type);
*sliceid = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h 
b/drivers/gpu/drm/i915/gt/intel_gt.h
index e7aabe0cc5bf..f9bcde31f697 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt.h
@@ -82,6 +82,7 @@ static inline bool intel_gt_needs_read_steering(struct 
intel_gt *gt,
 }
 
 u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
+u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
 
 void intel_gt_info_print(const struct intel_gt_info *info,
 struct drm_printer *p);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h 
b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index d93d578a4105..b06d8eaf12ea 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -47,9 +47,14 @@ struct intel_mmio_range {
  * of multicast registers.  If another type of steering does not have any
  * overlap in valid steering targets with 'subslice' style registers, we will
  * need to explicitly re-steer reads of registers of the other 

[PATCH 33/53] drm/i915/dg2: Add fake PCH

2021-07-01 Thread Matt Roper
As with DG1, DG2 has an ICL-style south display interface provided on
the same PCI device.  Add a fake PCH to ensure DG2 takes the appropriate
codepaths for south display handling.

Bspec: 54871, 50062, 49961, 53673
Cc: Lucas De Marchi 
Signed-off-by: Matt Roper 
Signed-off-by: Aditya Swarup 
Signed-off-by: José Roberto de Souza 
---
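Illustration only, not part of the patch: a sketch of why the i915_irq.c
hunk below can switch from an equality test to ">= PCH_DG1".  Fake PCH
types are numbered from a high base, so any later fake PCH automatically
sorts after DG1.  The enum here is a hypothetical, trimmed-down copy and
the function name is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down copy of the enum layout shown in the diff. */
enum pch_sketch {
	PCH_SKETCH_NONE = 0,
	PCH_SKETCH_ICP,
	PCH_SKETCH_ADP,
	/* Fake PCHs start at a high value so they sort after all real ones. */
	PCH_SKETCH_DG1 = 1024,
	PCH_SKETCH_DG2,
};

static_assert(PCH_SKETCH_DG2 > PCH_SKETCH_DG1,
	      "later fake PCHs must compare greater than DG1");

/* True for any fake PCH at or after DG1, mirroring the ">= PCH_DG1" test. */
static bool uses_dg1_hpd_pins_sketch(enum pch_sketch type)
{
	return type >= PCH_SKETCH_DG1;
}
```
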
 drivers/gpu/drm/i915/i915_irq.c  | 2 +-
 drivers/gpu/drm/i915/intel_pch.c | 3 +++
 drivers/gpu/drm/i915/intel_pch.h | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9d47ffa39093..34a0d49e760e 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -208,7 +208,7 @@ static void intel_hpd_init_pins(struct drm_i915_private 
*dev_priv)
(!HAS_PCH_SPLIT(dev_priv) || HAS_PCH_NOP(dev_priv)))
return;
 
-   if (HAS_PCH_DG1(dev_priv))
+   if (INTEL_PCH_TYPE(dev_priv) >= PCH_DG1)
hpd->pch_hpd = hpd_sde_dg1;
else if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP)
hpd->pch_hpd = hpd_icp;
diff --git a/drivers/gpu/drm/i915/intel_pch.c b/drivers/gpu/drm/i915/intel_pch.c
index 4e92ae19189e..cc44164e242b 100644
--- a/drivers/gpu/drm/i915/intel_pch.c
+++ b/drivers/gpu/drm/i915/intel_pch.c
@@ -211,6 +211,9 @@ void intel_detect_pch(struct drm_i915_private *dev_priv)
if (IS_DG1(dev_priv)) {
dev_priv->pch_type = PCH_DG1;
return;
+   } else if (IS_DG2(dev_priv)) {
+   dev_priv->pch_type = PCH_DG2;
+   return;
}
 
/*
diff --git a/drivers/gpu/drm/i915/intel_pch.h b/drivers/gpu/drm/i915/intel_pch.h
index e2f3f30c6445..7c0d83d292dc 100644
--- a/drivers/gpu/drm/i915/intel_pch.h
+++ b/drivers/gpu/drm/i915/intel_pch.h
@@ -30,6 +30,7 @@ enum intel_pch {
 
/* Fake PCHs, functionality handled on the same PCI dev */
PCH_DG1 = 1024,
+   PCH_DG2,
 };
 
 #define INTEL_PCH_DEVICE_ID_MASK   0xff80
@@ -62,6 +63,7 @@ enum intel_pch {
 
 #define INTEL_PCH_TYPE(dev_priv)   ((dev_priv)->pch_type)
 #define INTEL_PCH_ID(dev_priv) ((dev_priv)->pch_id)
+#define HAS_PCH_DG2(dev_priv)  (INTEL_PCH_TYPE(dev_priv) == 
PCH_DG2)
 #define HAS_PCH_ADP(dev_priv)  (INTEL_PCH_TYPE(dev_priv) == 
PCH_ADP)
 #define HAS_PCH_DG1(dev_priv)  (INTEL_PCH_TYPE(dev_priv) == 
PCH_DG1)
 #define HAS_PCH_JSP(dev_priv)  (INTEL_PCH_TYPE(dev_priv) == 
PCH_JSP)
-- 
2.25.4



[PATCH 18/53] drm/i915/xehpsdv: Add maximum sseu limits

2021-07-01 Thread Matt Roper
Due to the removal of legacy slices and the transition to a
gslice/cslice/mslice/etc. design, we'll internally store all DSS under
"slice0."

Signed-off-by: Matt Roper 
---
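Illustration only, not part of the patch: a sketch of what raising
GEN_MAX_SUBSLICES does to the derived mask stride.  The macros are
userspace re-creations with a _SKETCH suffix, not the kernel headers;
the stride is simply the number of bytes needed to hold one bit per
subslice.

```c
#include <assert.h>

#define BITS_PER_BYTE_SKETCH 8
/* Userspace equivalent of an integer divide that rounds up. */
#define DIV_ROUND_UP_SKETCH(n, d) (((n) + (d) - 1) / (d))
/* Bytes needed for a bitmask with max_entries bits. */
#define SSEU_STRIDE_SKETCH(max_entries) \
	DIV_ROUND_UP_SKETCH(max_entries, BITS_PER_BYTE_SKETCH)

static unsigned int subslice_stride_sketch(unsigned int max_subslices)
{
	assert(max_subslices > 0);
	return SSEU_STRIDE_SKETCH(max_subslices);
}
```

Going from 8 to 32 subslices grows the per-slice mask from one byte to
four, which is why code that read only subslice_mask[0] needs care.
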
 drivers/gpu/drm/i915/gt/intel_sseu.c | 5 -
 drivers/gpu/drm/i915/gt/intel_sseu.h | 2 +-
 drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 5d1b7d06c96b..16c0552fcd1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -145,7 +145,10 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   intel_sseu_set_info(sseu, 1, 6, 16);
+   if (IS_XEHPSDV(gt->i915))
+   intel_sseu_set_info(sseu, 1, 32, 16);
+   else
+   intel_sseu_set_info(sseu, 1, 6, 16);
 
/*
 * As mentioned above, Xe_HP does not have the concept of a slice.
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h 
b/drivers/gpu/drm/i915/gt/intel_sseu.h
index 74487650b08f..204ea6709460 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.h
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.h
@@ -16,7 +16,7 @@ struct intel_gt;
 struct drm_printer;
 
 #define GEN_MAX_SLICES (6) /* CNL upper bound */
-#define GEN_MAX_SUBSLICES  (8) /* ICL upper bound */
+#define GEN_MAX_SUBSLICES  (32) /* XEHPSDV upper bound */
 #define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
 #define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES)
 #define GEN_MAX_EUS (16) /* TGL upper bound */
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c 
b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
index 714fe8495775..a424150b052e 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c
@@ -53,7 +53,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
 static void gen10_sseu_device_status(struct intel_gt *gt,
 struct sseu_dev_info *sseu)
 {
-#define SS_MAX 6
+#define SS_MAX 8
struct intel_uncore *uncore = gt->uncore;
const struct intel_gt_info *info = &gt->info;
u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
-- 
2.25.4



[PATCH 03/53] drm/i915: Fork DG1 interrupt handler

2021-07-01 Thread Matt Roper
From: Paulo Zanoni 

The current interrupt handler is getting increasingly complicated and
Xe_HP changes will bring even more complexity.  Let's split off a new
interrupt handler starting with DG1 (i.e., when the master tile
interrupt register was added to the design) and use that as the basis
for the new Xe_HP changes.

Now that we track the hardware IP's release number as well as the
version number, we can also properly define DG1 as version "12.10" and
replace the has_master_unit_irq feature flag with an IP version test.

Bspec: 50875
Cc: Daniele Spurio Ceraolo 
Cc: Stuart Summers 
Signed-off-by: Paulo Zanoni 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Tomasz Lis 
Signed-off-by: Matt Roper 
---
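Illustration only, not part of the patch: a sketch of replacing a
per-platform feature flag with an IP version comparison, as the commit
message describes.  The packing used by IP_VER_SKETCH (version in the
high byte, release in the low byte) and the 12.10 threshold are
assumptions made for this example, and the function name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: version in the high byte, release in the low byte. */
#define IP_VER_SKETCH(ver, rel) (((ver) << 8) | (rel))

/* Hypothetical replacement for a per-platform feature flag. */
static bool has_master_tile_irq_sketch(unsigned int ver, unsigned int rel)
{
	assert(rel < 256);
	return IP_VER_SKETCH(ver, rel) >= IP_VER_SKETCH(12, 10);
}
```
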
 drivers/gpu/drm/i915/i915_drv.h  |   2 -
 drivers/gpu/drm/i915/i915_irq.c  | 139 +++
 drivers/gpu/drm/i915/i915_pci.c  |   2 +-
 drivers/gpu/drm/i915/i915_reg.h  |   4 +-
 drivers/gpu/drm/i915/intel_device_info.h |   1 -
 5 files changed, 95 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9639800485b9..519cce702f4b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1601,8 +1601,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define HAS_LOGICAL_RING_ELSQ(dev_priv) \
(INTEL_INFO(dev_priv)->has_logical_ring_elsq)
 
-#define HAS_MASTER_UNIT_IRQ(dev_priv) 
(INTEL_INFO(dev_priv)->has_master_unit_irq)
-
 #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
 
 #define INTEL_PPGTT(dev_priv) (INTEL_INFO(dev_priv)->ppgtt_type)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7d0ce8b9f8ed..9d47ffa39093 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2699,11 +2699,9 @@ gen11_display_irq_handler(struct drm_i915_private *i915)
enable_rpm_wakeref_asserts(&i915->runtime_pm);
 }
 
-static __always_inline irqreturn_t
-__gen11_irq_handler(struct drm_i915_private * const i915,
-   u32 (*intr_disable)(void __iomem * const regs),
-   void (*intr_enable)(void __iomem * const regs))
+static irqreturn_t gen11_irq_handler(int irq, void *arg)
 {
+   struct drm_i915_private *i915 = arg;
void __iomem * const regs = i915->uncore.regs;
struct intel_gt *gt = &i915->gt;
u32 master_ctl;
@@ -2712,9 +2710,9 @@ __gen11_irq_handler(struct drm_i915_private * const i915,
if (!intel_irqs_enabled(i915))
return IRQ_NONE;
 
-   master_ctl = intr_disable(regs);
+   master_ctl = gen11_master_intr_disable(regs);
if (!master_ctl) {
-   intr_enable(regs);
+   gen11_master_intr_enable(regs);
return IRQ_NONE;
}
 
@@ -2727,7 +2725,7 @@ __gen11_irq_handler(struct drm_i915_private * const i915,
 
gu_misc_iir = gen11_gu_misc_irq_ack(gt, master_ctl);
 
-   intr_enable(regs);
+   gen11_master_intr_enable(regs);
 
gen11_gu_misc_irq_handler(gt, gu_misc_iir);
 
@@ -2736,51 +2734,69 @@ __gen11_irq_handler(struct drm_i915_private * const 
i915,
return IRQ_HANDLED;
 }
 
-static irqreturn_t gen11_irq_handler(int irq, void *arg)
-{
-   return __gen11_irq_handler(arg,
-  gen11_master_intr_disable,
-  gen11_master_intr_enable);
-}
-
-static u32 dg1_master_intr_disable_and_ack(void __iomem * const regs)
+static inline u32 dg1_master_intr_disable(void __iomem * const regs)
 {
u32 val;
 
/* First disable interrupts */
-   raw_reg_write(regs, DG1_MSTR_UNIT_INTR, 0);
+   raw_reg_write(regs, DG1_MSTR_TILE_INTR, 0);
 
/* Get the indication levels and ack the master unit */
-   val = raw_reg_read(regs, DG1_MSTR_UNIT_INTR);
+   val = raw_reg_read(regs, DG1_MSTR_TILE_INTR);
if (unlikely(!val))
return 0;
 
-   raw_reg_write(regs, DG1_MSTR_UNIT_INTR, val);
-
-   /*
-* Now with master disabled, get a sample of level indications
-* for this interrupt and ack them right away - we keep GEN11_MASTER_IRQ
-* out as this bit doesn't exist anymore for DG1
-*/
-   val = raw_reg_read(regs, GEN11_GFX_MSTR_IRQ) & ~GEN11_MASTER_IRQ;
-   if (unlikely(!val))
-   return 0;
-
-   raw_reg_write(regs, GEN11_GFX_MSTR_IRQ, val);
+   raw_reg_write(regs, DG1_MSTR_TILE_INTR, val);
 
return val;
 }
 
 static inline void dg1_master_intr_enable(void __iomem * const regs)
 {
-   raw_reg_write(regs, DG1_MSTR_UNIT_INTR, DG1_MSTR_IRQ);
+   raw_reg_write(regs, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
 }
 
 static irqreturn_t dg1_irq_handler(int irq, void *arg)
 {
-   return __gen11_irq_handler(arg,
-  dg1_master_intr_disable_and_ack,
-  dg1_master_intr_enable);
+   struct drm_i915_private * const 

[PATCH 15/53] drm/i915/xehp: Loop over all gslices for INSTDONE processing

2021-07-01 Thread Matt Roper
We no longer have traditional slices on Xe_HP platforms, but the
INSTDONE registers are replicated according to gslice representation
which is similar.  We can mostly re-use the existing instdone code with
just a few modifications:

 * Create an alternate instdone loop macro that will iterate over the
   flat DSS space, but still provide the gslice/dss steering values for
   compatibility with the legacy code.

 * We should allocate INSTDONE storage space according to the maximum
   number of gslices rather than the maximum number of legacy slices to
   ensure we have enough storage space to hold all of the values.  XeHP
   design has 8 gslices, whereas older platforms never had more than 3
   slices.

Signed-off-by: Matt Roper 
---
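Illustration only, not part of the patch: a sketch of the first bullet
above, iterating over the flat DSS space while still producing
gslice/dss steering pairs.  The DSS-per-gslice value of 4 is an
assumption chosen so that 32 DSS map onto the 8 gslices mentioned in the
commit message; all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DSS_PER_GSLICE_SKETCH 4	/* assumed value, for illustration only */
#define MAX_DSS_SKETCH 32

/*
 * Walk the flat DSS mask and record the (gslice, dss) steering pair of
 * each enabled DSS.  Returns the number of pairs written.
 */
static int collect_steering_sketch(uint32_t dss_mask,
				   int gslice[MAX_DSS_SKETCH],
				   int dss[MAX_DSS_SKETCH])
{
	int iter, n = 0;

	assert(gslice && dss);

	for (iter = 0; iter < MAX_DSS_SKETCH; iter++) {
		if (!(dss_mask & (UINT32_C(1) << iter)))
			continue;	/* fused-off DSS: skip it */
		gslice[n] = iter / DSS_PER_GSLICE_SKETCH;
		dss[n] = iter % DSS_PER_GSLICE_SKETCH;
		n++;
	}
	return n;
}
```
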
 drivers/gpu/drm/i915/gt/intel_engine_cs.c| 48 +++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 -
 drivers/gpu/drm/i915/gt/intel_sseu.h |  7 +++
 drivers/gpu/drm/i915/i915_gpu_error.c| 32 +
 4 files changed, 66 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 6e2aa1acc4d4..e1302e9c168b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1181,16 +1181,16 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
u32 mmio_base = engine->mmio_base;
int slice;
int subslice;
+   int iter;
 
memset(instdone, 0, sizeof(*instdone));
 
-   switch (GRAPHICS_VER(i915)) {
-   default:
+   if (GRAPHICS_VER(i915) >= 8) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
if (engine->id != RCS0)
-   break;
+   return;
 
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1200,21 +1200,32 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
instdone->slice_common_extra[1] =
intel_uncore_read(uncore, 
GEN12_SC_INSTDONE_EXTRA2);
}
-   for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
-   instdone->sampler[slice][subslice] =
-   read_subslice_reg(engine, slice, subslice,
- GEN7_SAMPLER_INSTDONE);
-   instdone->row[slice][subslice] =
-   read_subslice_reg(engine, slice, subslice,
- GEN7_ROW_INSTDONE);
+
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
+   for_each_instdone_gslice_dss_xehp(i915, sseu, iter, 
slice, subslice) {
+   instdone->sampler[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ 
GEN7_SAMPLER_INSTDONE);
+   instdone->row[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ GEN7_ROW_INSTDONE);
+   }
+   } else {
+   for_each_instdone_slice_subslice(i915, sseu, slice, 
subslice) {
+   instdone->sampler[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ 
GEN7_SAMPLER_INSTDONE);
+   instdone->row[slice][subslice] =
+   read_subslice_reg(engine, slice, 
subslice,
+ GEN7_ROW_INSTDONE);
+   }
}
-   break;
-   case 7:
+   } else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
 
if (engine->id != RCS0)
-   break;
+   return;
 
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@@ -1222,22 +1233,15 @@ void intel_engine_get_instdone(const struct 
intel_engine_cs *engine,
intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE);
instdone->row[0][0] =
intel_uncore_read(uncore, GEN7_ROW_INSTDONE);
-
-   break;
-   case 6:
-   case 5:
-   case 4:
+   } else if (GRAPHICS_VER(i915) >= 4) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id == RCS0)
/* HACK: Using the wrong struct 

[PATCH 17/53] drm/i915/xehp: Changes to ss/eu definitions

2021-07-01 Thread Matt Roper
From: Matthew Auld 

Xe_HP no longer has "slices" in the same way that old platforms did.
There are new concepts (gslices, cslices, mslices) that apply in various
contexts, but for the purposes of fusing, slices no longer exist and we
just have one large pool of dual-subslices (DSS) to work with.
Furthermore, the meaning of the DSS fuse is inverted compared to past
platforms --- it now specifies which DSS are enabled rather than which
ones are disabled.

Cc: Henryk Napiatek 
Cc: Rodrigo Vivi 
Cc: Lucas De Marchi 
Cc: Tvrtko Ursulin 
Signed-off-by: Matthew Auld 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Radhakrishna Sripada 
Signed-off-by: Stuart Summers 
Signed-off-by: Prasad Nallani 
Signed-off-by: Matt Roper 
---
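Illustration only, not part of the patch: a sketch of the EU fuse
handling in the intel_sseu.c hunk below.  The fuse carries one bit per
pair of EUs; older platforms report disabled pairs, so the value is
inverted first, while the new register reports enabled pairs directly.
The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Expand a fuse with one bit per EU pair into a mask with one bit per EU.
 * If fuse_is_disable_mask is true, the raw value lists disabled pairs
 * (the older convention) and is inverted before expansion.
 */
static uint16_t eu_mask_sketch(uint8_t raw_fuse, bool fuse_is_disable_mask,
			       unsigned int max_eus)
{
	uint8_t pairs = fuse_is_disable_mask ? (uint8_t)~raw_fuse : raw_fuse;
	uint16_t eu_en = 0;
	unsigned int eu;

	assert(max_eus <= 16 && max_eus % 2 == 0);

	for (eu = 0; eu < max_eus / 2; eu++)
		if (pairs & (1u << eu))
			eu_en |= (uint16_t)((1u << (eu * 2)) | (1u << (eu * 2 + 1)));
	return eu_en;
}
```
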
 drivers/gpu/drm/i915/gt/intel_sseu.c | 24 
 drivers/gpu/drm/i915/i915_getparam.c |  6 --
 drivers/gpu/drm/i915/i915_reg.h  |  3 +++
 3 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index bbed8e8625e1..5d1b7d06c96b 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -139,17 +139,33 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS.
 * Instead of splitting these, provide userspace with an array
 * of DSS to more closely represent the hardware resource.
+*
+* In addition, the concept of slice has been removed in Xe_HP.
+* To be compatible with prior generations, assume a single slice
+* across the entire device. Then calculate out the DSS for each
+* workload type within that software slice.
 */
intel_sseu_set_info(sseu, 1, 6, 16);
 
-   s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
-   GEN11_GT_S_ENA_MASK;
+   /*
+* As mentioned above, Xe_HP does not have the concept of a slice.
+* Enable one for software backwards compatibility.
+*/
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+   s_en = 0x1;
+   else
+   s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
+  GEN11_GT_S_ENA_MASK;
 
dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
 
/* one bit per pair of EUs */
-   eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
-  GEN11_EU_DIS_MASK);
+   if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
+   eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & 
XEHP_EU_ENA_MASK;
+   else
+   eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
+  GEN11_EU_DIS_MASK);
+
for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++)
if (eu_en_fuse & BIT(eu))
eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 24e18219eb50..e289397d9178 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -15,7 +15,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
struct pci_dev *pdev = to_pci_dev(dev->dev);
const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
drm_i915_getparam_t *param = data;
-   int value;
+   int value = 0;
 
switch (param->param) {
case I915_PARAM_IRQ_ACTIVE:
@@ -150,7 +150,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
return -ENODEV;
break;
case I915_PARAM_SUBSLICE_MASK:
-   value = sseu->subslice_mask[0];
+   /* Only copy bits from the first slice */
+   memcpy(&value, sseu->subslice_mask,
+  min(sseu->ss_stride, (u8)sizeof(value)));
if (!value)
return -ENODEV;
break;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 43fdf63a2240..9edb58c796e8 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3151,6 +3151,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 
 #define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
 
+#define XEHP_EU_ENABLE _MMIO(0x9134)
+#define XEHP_EU_ENA_MASK   0xFF
+
 #define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050)
 #define   GEN6_BSD_SLEEP_MSG_DISABLE   (1 << 0)
 #define   GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
-- 
2.25.4



[PATCH 42/53] drm/i915/dg2: Add MPLLB programming for SNPS PHY

2021-07-01 Thread Matt Roper
DG2's SNPS PHYs incorporate a dedicated port PLL called MPLLB which
takes the place of the shared DPLLs we've used on past platforms.  Let's
add the MPLLB programming sequences; they'll be plugged into the rest of
the code in future patches.

Bspec: 54032
Bspec: 53881
Cc: Lucas De Marchi 
Signed-off-by: Matt Roper 
Signed-off-by: Vandita Kulkarni 
Signed-off-by: Jani Nikula 
Signed-off-by: Nidhi Gupta 
---
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/intel_display.c  |   1 +
 .../drm/i915/display/intel_display_types.h|  17 +-
 drivers/gpu/drm/i915/display/intel_dpll.c |  12 +-
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 517 ++
 drivers/gpu/drm/i915/display/intel_snps_phy.h |  18 +
 drivers/gpu/drm/i915/i915_reg.h   |  56 ++
 7 files changed, 616 insertions(+), 6 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.c
 create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.h

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 01f28ad5ea57..6b6c1e5a72d5 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -268,6 +268,7 @@ i915-y += \
display/intel_pps.o \
display/intel_qp_tables.o \
display/intel_sdvo.o \
+   display/intel_snps_phy.o \
display/intel_tv.o \
display/intel_vdsc.o \
display/intel_vrr.o \
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index dc2b943a4e72..91f6964ec406 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -59,6 +59,7 @@
 #include "display/intel_hdmi.h"
 #include "display/intel_lvds.h"
 #include "display/intel_sdvo.h"
+#include "display/intel_snps_phy.h"
 #include "display/intel_tv.h"
 #include "display/intel_vdsc.h"
 #include "display/intel_vrr.h"
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index d94f361b548b..29ae1d9b5abc 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -884,6 +884,18 @@ enum intel_output_format {
INTEL_OUTPUT_FORMAT_YCBCR444,
 };
 
+struct intel_mpllb_state {
+   u32 clock; /* in KHz */
+   u32 ref_control;
+   u32 mpllb_cp;
+   u32 mpllb_div;
+   u32 mpllb_div2;
+   u32 mpllb_fracn1;
+   u32 mpllb_fracn2;
+   u32 mpllb_sscen;
+   u32 mpllb_sscstep;
+};
+
 struct intel_crtc_state {
/*
 * uapi (drm) state. This is the software state shown to userspace.
@@ -1018,7 +1030,10 @@ struct intel_crtc_state {
struct intel_shared_dpll *shared_dpll;
 
/* Actual register state of the dpll, for shared dpll cross-checking. */
-   struct intel_dpll_hw_state dpll_hw_state;
+   union {
+   struct intel_dpll_hw_state dpll_hw_state;
+   struct intel_mpllb_state mpllb_state;
+   };
 
/*
 * ICL reserved DPLLs for the CRTC/port. The active PLL is selected by
diff --git a/drivers/gpu/drm/i915/display/intel_dpll.c 
b/drivers/gpu/drm/i915/display/intel_dpll.c
index 89635da9f6f6..14515e62c05e 100644
--- a/drivers/gpu/drm/i915/display/intel_dpll.c
+++ b/drivers/gpu/drm/i915/display/intel_dpll.c
@@ -11,6 +11,7 @@
 #include "intel_lvds.h"
 #include "intel_panel.h"
 #include "intel_sideband.h"
+#include "display/intel_snps_phy.h"
 
 struct intel_limit {
struct {
@@ -923,12 +924,13 @@ static int hsw_crtc_compute_clock(struct intel_crtc *crtc,
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
struct intel_atomic_state *state =
to_intel_atomic_state(crtc_state->uapi.state);
+   struct intel_encoder *encoder =
+   intel_get_crtc_new_encoder(state, crtc_state);
 
-   if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI) ||
-   DISPLAY_VER(dev_priv) >= 11) {
-   struct intel_encoder *encoder =
-   intel_get_crtc_new_encoder(state, crtc_state);
-
+   if (IS_DG2(dev_priv)) {
+   return intel_mpllb_calc_state(crtc_state, encoder);
+   } else if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI) ||
+  DISPLAY_VER(dev_priv) >= 11) {
if (!intel_reserve_shared_dplls(state, crtc, encoder)) {
drm_dbg_kms(&dev_priv->drm,
"failed to find PLL for pipe %c\n",
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c 
b/drivers/gpu/drm/i915/display/intel_snps_phy.c
new file mode 100644
index 000000000000..6d9205906595
--- /dev/null
+++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c
@@ -0,0 +1,517 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "intel_de.h"
+#include "intel_display_types.h"
+#include "intel_snps_phy.h"
+
+/**
+ * DOC: 

[PATCH 34/53] drm/i915/dg2: Add cdclk table and reference clock

2021-07-01 Thread Matt Roper
Note that DG2 only has a single possible refclk frequency (38.4 MHz).

Bspec: 54034
Cc: Lucas De Marchi 
Signed-off-by: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_cdclk.c | 24 --
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c 
b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 613ffcc68eba..08f34d87684f 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -1290,6 +1290,18 @@ static const struct intel_cdclk_vals adlp_cdclk_table[] 
= {
{}
 };
 
+static const struct intel_cdclk_vals dg2_cdclk_table[] = {
+   { .refclk = 38400, .cdclk = 172800, .divider = 2, .ratio =  9 },
+   { .refclk = 38400, .cdclk = 179200, .divider = 3, .ratio = 14 },
+   { .refclk = 38400, .cdclk = 192000, .divider = 2, .ratio = 10 },
+   { .refclk = 38400, .cdclk = 192000, .divider = 3, .ratio = 15 },
+   { .refclk = 38400, .cdclk = 307200, .divider = 2, .ratio = 16 },
+   { .refclk = 38400, .cdclk = 326400, .divider = 4, .ratio = 34 },
+   { .refclk = 38400, .cdclk = 556800, .divider = 2, .ratio = 29 },
+   { .refclk = 38400, .cdclk = 652800, .divider = 2, .ratio = 34 },
+   {}
+};
+
 static int bxt_calc_cdclk(struct drm_i915_private *dev_priv, int min_cdclk)
 {
const struct intel_cdclk_vals *table = dev_priv->cdclk.table;
@@ -1408,7 +1420,9 @@ static void bxt_de_pll_readout(struct drm_i915_private 
*dev_priv,
 {
u32 val, ratio;
 
-   if (DISPLAY_VER(dev_priv) >= 11)
+   if (IS_DG2(dev_priv))
+   cdclk_config->ref = 38400;
+   else if (DISPLAY_VER(dev_priv) >= 11)
icl_readout_refclk(dev_priv, cdclk_config);
else if (IS_CANNONLAKE(dev_priv))
cnl_readout_refclk(dev_priv, cdclk_config);
@@ -2878,7 +2892,13 @@ u32 intel_read_rawclk(struct drm_i915_private *dev_priv)
  */
 void intel_init_cdclk_hooks(struct drm_i915_private *dev_priv)
 {
-   if (IS_ALDERLAKE_P(dev_priv)) {
+   if (IS_DG2(dev_priv)) {
+   dev_priv->display.set_cdclk = bxt_set_cdclk;
+   dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk;
+   dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk;
+   dev_priv->display.calc_voltage_level = tgl_calc_voltage_level;
+   dev_priv->cdclk.table = dg2_cdclk_table;
+   } else if (IS_ALDERLAKE_P(dev_priv)) {
dev_priv->display.set_cdclk = bxt_set_cdclk;
dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk;
dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk;
-- 
2.25.4
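
A side note for readers of the table: every row of dg2_cdclk_table in this
patch is consistent with cdclk = refclk * ratio / divider (all in kHz).
The standalone sketch below checks that arithmetic. The struct and helper
names (cdclk_vals, cdclk_from_entry, cdclk_table_consistent) are made up
for illustration and are not the driver's API; only the numbers come from
the patch.

```c
#include <stddef.h>

/* Hypothetical mirror of one table row; field names follow the patch. */
struct cdclk_vals {
	unsigned int refclk;	/* kHz */
	unsigned int cdclk;	/* kHz */
	unsigned int divider;
	unsigned int ratio;
};

/* Values copied from dg2_cdclk_table in the patch above. */
static const struct cdclk_vals dg2_rows[] = {
	{ .refclk = 38400, .cdclk = 172800, .divider = 2, .ratio =  9 },
	{ .refclk = 38400, .cdclk = 179200, .divider = 3, .ratio = 14 },
	{ .refclk = 38400, .cdclk = 192000, .divider = 2, .ratio = 10 },
	{ .refclk = 38400, .cdclk = 192000, .divider = 3, .ratio = 15 },
	{ .refclk = 38400, .cdclk = 307200, .divider = 2, .ratio = 16 },
	{ .refclk = 38400, .cdclk = 326400, .divider = 4, .ratio = 34 },
	{ .refclk = 38400, .cdclk = 556800, .divider = 2, .ratio = 29 },
	{ .refclk = 38400, .cdclk = 652800, .divider = 2, .ratio = 34 },
};

/* cdclk (kHz) implied by a row's refclk, ratio and divider. */
static unsigned int cdclk_from_entry(const struct cdclk_vals *v)
{
	return v->refclk * v->ratio / v->divider;
}

/* Returns 1 if every row's .cdclk equals refclk * ratio / divider. */
static int cdclk_table_consistent(const struct cdclk_vals *t, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cdclk_from_entry(&t[i]) != t[i].cdclk)
			return 0;
	return 1;
}
```

Two rows produce the same 192000 kHz cdclk from different divider/ratio
pairs, which the check above accepts since both satisfy the relation.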



[PATCH 13/53] drm/i915/xehp: New engine context offsets

2021-07-01 Thread Matt Roper
From: Prathap Kumar Valsan 

The layout of some engine contexts has changed on Xe_HP.  Define the new
offsets.

Bspec: 45585, 46256
Signed-off-by: Prathap Kumar Valsan 
Signed-off-by: Ramalingam C 
Signed-off-by: Venkata Ramana Nayana 
Signed-off-by: Akeem G Abodunrin 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 65 ++---
 1 file changed, 59 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index e1c80e2c06d8..fee735e2a524 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -484,6 +484,47 @@ static const u8 gen12_rcs_offsets[] = {
END
 };
 
+static const u8 xehp_rcs_offsets[] = {
+   NOP(1),
+   LRI(13, POSTED),
+   REG16(0x244),
+   REG(0x034),
+   REG(0x030),
+   REG(0x038),
+   REG(0x03c),
+   REG(0x168),
+   REG(0x140),
+   REG(0x110),
+   REG(0x1c0),
+   REG(0x1c4),
+   REG(0x1c8),
+   REG(0x180),
+   REG16(0x2b4),
+
+   NOP(5),
+   LRI(9, POSTED),
+   REG16(0x3a8),
+   REG16(0x28c),
+   REG16(0x288),
+   REG16(0x284),
+   REG16(0x280),
+   REG16(0x27c),
+   REG16(0x278),
+   REG16(0x274),
+   REG16(0x270),
+
+   LRI(3, POSTED),
+   REG(0x1b0),
+   REG16(0x5a8),
+   REG16(0x5ac),
+
+   NOP(6),
+   LRI(1, 0),
+   REG(0x0c8),
+
+   END
+};
+
 #undef END
 #undef REG16
 #undef REG
@@ -502,7 +543,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
   !intel_engine_has_relative_mmio(engine));
 
if (engine->class == RENDER_CLASS) {
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   return xehp_rcs_offsets;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_rcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 11)
return gen11_rcs_offsets;
@@ -522,7 +565,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs 
*engine)
 
 static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
 {
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   return 0x70;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return 0x60;
else if (GRAPHICS_VER(engine->i915) >= 9)
return 0x54;
@@ -534,7 +579,9 @@ static int lrc_ring_mi_mode(const struct intel_engine_cs 
*engine)
 
 static int lrc_ring_gpr0(const struct intel_engine_cs *engine)
 {
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   return 0x84;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return 0x74;
else if (GRAPHICS_VER(engine->i915) >= 9)
return 0x68;
@@ -578,10 +625,16 @@ static int lrc_ring_indirect_offset(const struct 
intel_engine_cs *engine)
 
 static int lrc_ring_cmd_buf_cctl(const struct intel_engine_cs *engine)
 {
-   if (engine->class != RENDER_CLASS)
-   return -1;
 
-   if (GRAPHICS_VER(engine->i915) >= 12)
+   if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
+   /*
+* Note that the CSFE context has a dummy slot for CMD_BUF_CCTL
+* simply to match the RCS context image layout.
+*/
+   return 0xc6;
+   else if (engine->class != RENDER_CLASS)
+   return -1;
+   else if (GRAPHICS_VER(engine->i915) >= 12)
return 0xb6;
else if (GRAPHICS_VER(engine->i915) >= 11)
return 0xaa;
-- 
2.25.4
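
To illustrate why the new 12.50 checks are placed ahead of the existing
">= 12" checks, here is a small standalone sketch. IP_VER() uses the same
packing as the macro introduced in patch 01/53; struct ver_info and both
helpers are hypothetical stand-ins, not driver code, and the fallback
return value is only a placeholder for the cases not visible in the
quoted hunk.

```c
/* Same packing as the IP_VER() macro from patch 01/53. */
#define IP_VER(ver, release)	((ver) << 8 | (release))

/* Hypothetical stand-in for the version fields in intel_device_info. */
struct ver_info {
	unsigned int graphics_ver;
	unsigned int graphics_ver_release;
};

/* Packs version and release into one comparable number. */
static unsigned int graphics_ver_full(const struct ver_info *info)
{
	return IP_VER(info->graphics_ver, info->graphics_ver_release);
}

/*
 * Mirrors the check order of lrc_ring_mi_mode() after this patch.
 * The 12.50 test must come first: a 12.50 part also satisfies
 * "graphics_ver >= 12" and would otherwise get the gen12 offset.
 */
static int ring_mi_mode_offset(const struct ver_info *info)
{
	if (graphics_ver_full(info) >= IP_VER(12, 50))
		return 0x70;
	else if (info->graphics_ver >= 12)
		return 0x60;
	else if (info->graphics_ver >= 9)
		return 0x54;

	return -1;	/* placeholder: older cases are not shown in the hunk */
}
```

With this packing a 12.55 device compares greater than IP_VER(12, 50), so
it takes the same branch as 12.50.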



[PATCH 52/53] drm/i915/dg2: Update to bigjoiner path

2021-07-01 Thread Matt Roper
From: Animesh Manna 

In verify_mpllb_state() the encoder is retrieved from the best_encoder
of the connector_state. Since there is only one connector_state for
bigjoiner, checking the encoder may not be needed for the bigjoiner
slave. This MPLLB code path is used on DG2, which needs this fix to
avoid a NULL pointer dereference.

Cc: Manasi Navare 
Signed-off-by: Animesh Manna 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_display.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 9655f1b1b41b..3f4e811145b6 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9153,6 +9153,9 @@ verify_mpllb_state(struct intel_atomic_state *state,
if (!new_crtc_state->hw.active)
return;
 
+   if (new_crtc_state->bigjoiner_slave)
+   return;
+
encoder = intel_get_crtc_new_encoder(state, new_crtc_state);
intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state);
 
-- 
2.25.4



[PATCH 25/53] drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV

2021-07-01 Thread Matt Roper
DG2 supports compute DSS and has the same maximum number of DSS and EU
as XeHP SDV.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 5d3b8dff464c..eaff221db5b0 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -171,7 +171,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   if (IS_XEHPSDV(gt->i915)) {
+   if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) {
intel_sseu_set_info(sseu, 1, 32, 16);
sseu->has_compute_dss = 1;
} else {
-- 
2.25.4



[PATCH 16/53] drm/i915/xehpsdv: add initial XeHP SDV definitions

2021-07-01 Thread Matt Roper
From: Lucas De Marchi 

XeHP SDV is an Intel® dGPU without display. This is just the definition
of some basic platform macros, by and large a copy of the current state of
Tigerlake, which does not reflect the end state of this platform.

Bspec: 44467, 48077
Cc: Rodrigo Vivi 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: José Roberto de Souza 
Signed-off-by: Stuart Summers 
Signed-off-by: Tomas Winkler 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/i915_drv.h  | 10 ++
 drivers/gpu/drm/i915/i915_pci.c  | 20 
 drivers/gpu/drm/i915/intel_device_info.c |  1 +
 drivers/gpu/drm/i915/intel_device_info.h |  1 +
 4 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c02600850246..63bed18a2be7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_DG1(dev_priv)IS_PLATFORM(dev_priv, INTEL_DG1)
 #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S)
 #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P)
+#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV)
 #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \
(INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00)
 #define IS_BDW_ULT(dev_priv) \
@@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
(IS_ALDERLAKE_P(__i915) && \
 IS_GT_STEP(__i915, since, until))
 
+#define XEHPSDV_REVID_A0   0x0
+#define XEHPSDV_REVID_A1   0x1
+#define XEHPSDV_REVID_A_LAST   XEHPSDV_REVID_A1
+#define XEHPSDV_REVID_B0   0x4
+#define XEHPSDV_REVID_C0   0x8
+
+#define IS_XEHPSDV_REVID(p, since, until) \
+   (IS_XEHPSDV(p) && IS_REVID(p, since, until))
+
 #define IS_LP(dev_priv)(INTEL_INFO(dev_priv)->is_lp)
 #define IS_GEN9_LP(dev_priv)   (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv))
 #define IS_GEN9_BC(dev_priv)   (GRAPHICS_VER(dev_priv) == 9 && 
!IS_LP(dev_priv))
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 88b279452b87..046309e95f43 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = {
.ppgtt_size = 48, \
.ppgtt_type = INTEL_PPGTT_FULL
 
+#define XE_HPM_FEATURES \
+   .media_ver = 12, \
+   .media_ver_release = 50
+
+__maybe_unused
+static const struct intel_device_info xehpsdv_info = {
+   XE_HP_FEATURES,
+   XE_HPM_FEATURES,
+   DGFX_FEATURES,
+   PLATFORM(INTEL_XEHPSDV),
+   .display = { },
+   .pipe_mask = 0,
+   .platform_engine_mask =
+   BIT(RCS0) | BIT(BCS0) |
+   BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) |
+   BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) |
+   BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7),
+   .require_force_probe = 1,
+};
+
 #undef PLATFORM
 
 /*
diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
b/drivers/gpu/drm/i915/intel_device_info.c
index e8ad14f002c1..7b37b68f4548 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -68,6 +68,7 @@ static const char * const platform_names[] = {
PLATFORM_NAME(DG1),
PLATFORM_NAME(ALDERLAKE_S),
PLATFORM_NAME(ALDERLAKE_P),
+   PLATFORM_NAME(XEHPSDV),
 };
 #undef PLATFORM_NAME
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index f824de632cfe..e8684199b0c9 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -88,6 +88,7 @@ enum intel_platform {
INTEL_DG1,
INTEL_ALDERLAKE_S,
INTEL_ALDERLAKE_P,
+   INTEL_XEHPSDV,
INTEL_MAX_PLATFORMS
 };
 
-- 
2.25.4



[PATCH 28/53] drm/i915/dg2: Add SQIDI steering

2021-07-01 Thread Matt Roper
Although DG2_G10 platforms will always have all SQIDI's present and
don't need steering for registers in a SQIDI MMIO range, this isn't true
for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.

We handle SQIDI ranges a bit differently from other types of explicit
steering.  The SQIDI ranges belong to either the MCFG unit or the SF
unit, both of which have their own dedicated steering registers and do
not use the typical 0xFDC steering control that all other types of
ranges use.  Thus we only need to worry about picking a valid initial
value for the MCFG and SF steering registers (0xFD0 and 0xFD8
respectively) at driver init; they won't change after we set them up so
we don't need to worry about re-steering them explicitly at runtime.

Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV,
while only values of 2 and 3 are valid for DG2-G11, we'll just
initialize the MCFG and SF steering registers to a constant value of "2"
for all XeHP-based platforms for simplicity --- that will work in all
cases.

Bspec: 66534
Cc: Radhakrishna Sripada 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +
 drivers/gpu/drm/i915/i915_reg.h |  2 ++
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 4302dc1b728e..f97ff2848122 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -944,17 +944,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
 }
 
-static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list 
*wal,
-unsigned slice, unsigned subslice)
+static void __set_mcr_steering(struct i915_wa_list *wal,
+  i915_reg_t steering_reg,
+  unsigned int slice, unsigned int subslice)
 {
u32 mcr, mcr_mask;
 
mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice);
mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
 
-   drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr);
+   wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
+}
+
+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list 
*wal,
+unsigned int slice, unsigned int subslice)
+{
+   drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
 
-   wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
+   __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
 }
 
 static void
@@ -1008,7 +1015,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list 
*wal)
 * - L3 Bank (fusable)
 * - MSLICE (fusable)
 * - LNCF (sub-unit within mslice; always present if mslice is present)
-* - SQIDI (always on)
 *
 * We'll do our default/implicit steering based on GSLICE (in the
 * sliceid field) and DSS (in the subsliceid field).  If we can
@@ -1058,6 +1064,18 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list 
*wal)
WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
 
__add_mcr_wa(i915, wal, slice, subslice);
+
+   /*
+* SQIDI ranges are special because they use different steering
+* registers than everything else we work with.  On XeHP SDV and
+* DG2-G10, any value in the steering registers will work fine since
+* all instances are present, but DG2-G11 only has SQIDI instances at
+* ID's 2 and 3, so we need to steer to one of those.  For simplicity
+* we'll just steer to a hardcoded "2" since that value will work
+* everywhere.
+*/
+   __set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2);
+   __set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2992e8585399..b19d102e0a01 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -2686,6 +2686,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define GEN12_SC_INSTDONE_EXTRA2   _MMIO(0x7108)
 #define GEN7_SAMPLER_INSTDONE  _MMIO(0xe160)
 #define GEN7_ROW_INSTDONE  _MMIO(0xe164)
+#define MCFG_MCR_SELECTOR  _MMIO(0xfd0)
+#define SF_MCR_SELECTOR    _MMIO(0xfd8)
 #define GEN8_MCR_SELECTOR  _MMIO(0xfdc)
 #define   GEN8_MCR_SLICE(slice)(((slice) & 3) << 26)
 #define   GEN8_MCR_SLICE_MASK  GEN8_MCR_SLICE(3)
-- 
2.25.4
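
For readers unfamiliar with the steering registers, the value written by
__set_mcr_steering() can be modelled as a clear-then-set update of the
slice and subslice fields. The sketch below is standalone C, not driver
code. The bit positions (slice in bits 30:27, subslice in bits 26:24) are
an assumption made for illustration; check the GEN11_MCR_* definitions in
i915_reg.h for the real layout. It models only the resulting register
value, not how the workaround list stores or applies the write.

```c
#include <stdint.h>

/*
 * Assumed field layout, for illustration only: slice in bits 30:27,
 * subslice in bits 26:24.
 */
#define MCR_SLICE(s)		(((uint32_t)(s) & 0xf) << 27)
#define MCR_SLICE_MASK		MCR_SLICE(0xf)
#define MCR_SUBSLICE(ss)	(((uint32_t)(ss) & 0x7) << 24)
#define MCR_SUBSLICE_MASK	MCR_SUBSLICE(0x7)

/* Clear-then-set: bits in 'clr' are cleared, then 'set' is ORed in. */
static uint32_t clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}

/* Register value after steering to the given slice/subslice pair. */
static uint32_t steer(uint32_t old, unsigned int slice, unsigned int subslice)
{
	uint32_t mcr = MCR_SLICE(slice) | MCR_SUBSLICE(subslice);
	uint32_t mcr_mask = MCR_SLICE_MASK | MCR_SUBSLICE_MASK;

	return clr_set(old, mcr_mask, mcr);
}
```

Under that assumed layout, steer(old, 0, 2) corresponds to the
hardcoded "2" used for MCFG_MCR_SELECTOR and SF_MCR_SELECTOR: only the
slice/subslice fields change and all other bits of the old value are
preserved.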



[PATCH 19/53] drm/i915/xehpsdv: Add compute DSS type

2021-07-01 Thread Matt Roper
From: Stuart Summers 

Starting in XeHP, the concept of slice has been removed in favor of
DSS (Dual-Subslice) masks for various workload types. These workloads have
been divided into those enabled for geometry and those enabled for compute.

i915 currently maintains a single set of S/SS/EU masks for the device.
The goal of this patch set is to minimize the amount of impact to prior
generations while still giving the user maximum flexibility.

Bspec: 33117, 33118, 20376
Cc: Daniele Ceraolo Spurio 
Cc: Matt Roper 
Signed-off-by: Stuart Summers 
Signed-off-by: Steve Hampson 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_sseu.c | 73 
 drivers/gpu/drm/i915/gt/intel_sseu.h |  5 +-
 drivers/gpu/drm/i915/i915_reg.h  |  3 +-
 include/uapi/drm/i915_drm.h  |  3 --
 4 files changed, 59 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c 
b/drivers/gpu/drm/i915/gt/intel_sseu.c
index 16c0552fcd1d..5d3b8dff464c 100644
--- a/drivers/gpu/drm/i915/gt/intel_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_sseu.c
@@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info 
*sseu, u8 slice)
 }
 
 void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
- u32 ss_mask)
+ u8 *subslice_mask, u32 ss_mask)
 {
int offset = slice * sseu->ss_stride;
 
-   memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
+   memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
 }
 
 unsigned int
@@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info 
*sseu)
return total;
 }
 
-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
-   u8 s_en, u32 ss_en, u16 eu_en)
+static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
+{
+   u32 ss_mask;
+
+   ss_mask = ss_en >> (s * sseu->max_subslices);
+   ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
+
+   return ss_mask;
+}
+
+static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
+   u32 g_ss_en, u32 c_ss_en, u16 eu_en)
 {
int s, ss;
 
-   /* ss_en represents entire subslice mask across all slices */
+   /* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
-  sizeof(ss_en) * BITS_PER_BYTE);
+  sizeof(g_ss_en) * BITS_PER_BYTE);
 
for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@@ -115,7 +125,23 @@ static void gen11_compute_sseu_info(struct sseu_dev_info 
*sseu,
 
sseu->slice_mask |= BIT(s);
 
-   intel_sseu_set_subslices(sseu, s, ss_en);
+   /*
+* XeHP introduces the concept of compute vs
+* geometry DSS. To reduce variation between GENs
+* around subslice usage, store a mask for both the
+* geometry and compute enabled masks, to provide
+* to user space later in QUERY_TOPOLOGY_INFO, and
+* compute a total enabled subslice count for the
+* purposes of selecting subslices to use in a
+* particular GEM context.
+*/
+   intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
+get_ss_stride_mask(sseu, s, c_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
+get_ss_stride_mask(sseu, s, g_ss_en));
+   intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
+get_ss_stride_mask(sseu, s,
+   g_ss_en | c_ss_en));
 
for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@@ -129,7 +155,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 {
struct sseu_dev_info *sseu = &gt->info.sseu;
struct intel_uncore *uncore = gt->uncore;
-   u32 dss_en;
+   u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@@ -145,10 +171,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
 * across the entire device. Then calculate out the DSS for each
 * workload type within that software slice.
 */
-   if (IS_XEHPSDV(gt->i915))
+   if (IS_XEHPSDV(gt->i915)) {
intel_sseu_set_info(sseu, 1, 32, 16);
-   else
+   sseu->has_compute_dss = 1;
+   } else {
intel_sseu_set_info(sseu, 1, 6, 16);
+   }
 
/*
 * As mentioned above, Xe_HP does not have the concept of a slice.
@@ -160,7 +188,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
s_en = intel_uncore_read(uncore, 

[PATCH 09/53] drm/i915/xehp: Extra media engines - Part 3 (reset)

2021-07-01 Thread Matt Roper
From: John Harrison 

Xe_HP can have a lot of extra media engines. This patch adds the reset
support for them.

Signed-off-by: John Harrison 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 6 ++
 drivers/gpu/drm/i915/i915_reg.h   | 8 
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 72251638d4ea..9586613ee399 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -515,8 +515,14 @@ static int gen11_reset_engines(struct intel_gt *gt,
[VCS1]  = GEN11_GRDOM_MEDIA2,
[VCS2]  = GEN11_GRDOM_MEDIA3,
[VCS3]  = GEN11_GRDOM_MEDIA4,
+   [VCS4]  = GEN11_GRDOM_MEDIA5,
+   [VCS5]  = GEN11_GRDOM_MEDIA6,
+   [VCS6]  = GEN11_GRDOM_MEDIA7,
+   [VCS7]  = GEN11_GRDOM_MEDIA8,
[VECS0] = GEN11_GRDOM_VECS,
[VECS1] = GEN11_GRDOM_VECS2,
+   [VECS2] = GEN11_GRDOM_VECS3,
+   [VECS3] = GEN11_GRDOM_VECS4,
};
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index cb1716b6ce72..dbc233442dd0 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -395,10 +395,18 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define  GEN11_GRDOM_MEDIA2(1 << 6)
 #define  GEN11_GRDOM_MEDIA3(1 << 7)
 #define  GEN11_GRDOM_MEDIA4(1 << 8)
+#define  GEN11_GRDOM_MEDIA5(1 << 9)
+#define  GEN11_GRDOM_MEDIA6(1 << 10)
+#define  GEN11_GRDOM_MEDIA7(1 << 11)
+#define  GEN11_GRDOM_MEDIA8(1 << 12)
 #define  GEN11_GRDOM_VECS  (1 << 13)
 #define  GEN11_GRDOM_VECS2 (1 << 14)
+#define  GEN11_GRDOM_VECS3 (1 << 15)
+#define  GEN11_GRDOM_VECS4 (1 << 16)
 #define  GEN11_GRDOM_SFC0  (1 << 17)
 #define  GEN11_GRDOM_SFC1  (1 << 18)
+#define  GEN11_GRDOM_SFC2  (1 << 19)
+#define  GEN11_GRDOM_SFC3  (1 << 20)
 
 #define  GEN11_VCS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << 
((instance) >> 1))
 #define  GEN11_VECS_SFC_RESET_BIT(instance)(GEN11_GRDOM_SFC0 << (instance))
-- 
2.25.4
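
As a quick check of how the extra engines line up with the new SFC reset
bits, here is a standalone sketch of the two *_SFC_RESET_BIT() macros
quoted at the end of the hunk. The bit values are the ones from the patch;
the lower-case helper names are made up for illustration.

```c
/* SFC reset domain bits, as defined in the patch. */
#define GRDOM_SFC0	(1u << 17)
#define GRDOM_SFC1	(1u << 18)
#define GRDOM_SFC2	(1u << 19)
#define GRDOM_SFC3	(1u << 20)

/* Same shape as GEN11_VCS_SFC_RESET_BIT(): two VCS instances per SFC. */
static unsigned int vcs_sfc_reset_bit(unsigned int instance)
{
	return GRDOM_SFC0 << (instance >> 1);
}

/* Same shape as GEN11_VECS_SFC_RESET_BIT(): one VECS instance per SFC. */
static unsigned int vecs_sfc_reset_bit(unsigned int instance)
{
	return GRDOM_SFC0 << instance;
}
```

With these shifts, VCS4/VCS5 map to SFC2, VCS6/VCS7 map to SFC3, and
VECS2/VECS3 map to SFC2/SFC3, so the existing macros cover the new
engines without modification.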



[PATCH 27/53] drm/i915/dg2: Update LNCF steering ranges

2021-07-01 Thread Matt Roper
DG2's replicated register ranges are almost the same as XeHP SDV's, with
the exception of one LNCF sub-range that switches to gslice steering.
We can re-use the XeHP SDV mslice steering table and just provide a
DG2-specific LNCF steering table.

Bspec: 66534
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c 
b/drivers/gpu/drm/i915/gt/intel_gt.c
index 9d1c99c9c0dd..d640fd37792f 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -103,6 +103,12 @@ static const struct intel_mmio_range 
xehpsdv_lncf_steering_table[] = {
{},
 };
 
+static const struct intel_mmio_range dg2_lncf_steering_table[] = {
+   { 0x00B000, 0x00B0FF },
+   { 0x00D880, 0x00D8FF },
+   {},
+};
+
 static u16 slicemask(struct intel_gt *gt, int count)
 {
u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
@@ -127,7 +133,10 @@ int intel_gt_init_mmio(struct intel_gt *gt)
(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
 GEN12_MEML3_EN_MASK);
 
-   if (IS_XEHPSDV(gt->i915)) {
+   if (IS_DG2(gt->i915)) {
+   gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
+   gt->steering_table[LNCF] = dg2_lncf_steering_table;
+   } else if (IS_XEHPSDV(gt->i915)) {
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
} else if (GRAPHICS_VER(gt->i915) >= 11 &&
-- 
2.25.4
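
To show how a table like dg2_lncf_steering_table might be consulted, the
standalone sketch below walks a range table until it reaches the empty
terminator entry. The struct layout (start/end fields), the inclusive
bounds and the helper name offset_in_table are assumptions for
illustration, not taken from this patch; only the two ranges are.

```c
#include <stdint.h>

/* Hypothetical mirror of struct intel_mmio_range; field names assumed. */
struct mmio_range {
	uint32_t start;
	uint32_t end;
};

/* Ranges copied from dg2_lncf_steering_table in the patch. */
static const struct mmio_range dg2_lncf_ranges[] = {
	{ 0x00B000, 0x00B0FF },
	{ 0x00D880, 0x00D8FF },
	{ 0, 0 },	/* terminator, like the {} entry in the patch */
};

/* Returns 1 if 'offset' lies in any range before the terminator. */
static int offset_in_table(const struct mmio_range *t, uint32_t offset)
{
	for (; t->end; t++)
		if (offset >= t->start && offset <= t->end)
			return 1;
	return 0;
}
```

Under these assumptions, offsets 0xB000 and 0xD8FF are treated as
LNCF-steered and 0xB100 is not.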



[PATCH 24/53] drm/i915/dg2: add DG2 platform info

2021-07-01 Thread Matt Roper
DG2 has Xe_LPD display (version 13) and Xe_HPG (version 12.55) graphics.
There are two variants (treated as subplatforms in the code):  DG2-G10
and DG2-G11 that require independent programming in some areas (e.g.,
workarounds).

Bspec: 44472, 44474, 46197, 48028, 48077
Cc: Anusha Srivatsa 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/i915_drv.h  | 27 
 drivers/gpu/drm/i915/i915_pci.c  | 16 ++
 drivers/gpu/drm/i915/intel_device_info.c |  1 +
 drivers/gpu/drm/i915/intel_device_info.h |  5 +
 drivers/gpu/drm/i915/intel_step.c| 20 +-
 drivers/gpu/drm/i915/intel_step.h|  1 +
 6 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63bed18a2be7..828ad607795a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1407,6 +1407,11 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S)
 #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P)
 #define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV)
+#define IS_DG2(dev_priv)   IS_PLATFORM(dev_priv, INTEL_DG2)
+#define IS_DG2_G10(dev_priv) \
+   IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G10)
+#define IS_DG2_G11(dev_priv) \
+   IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G11)
 #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \
(INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00)
 #define IS_BDW_ULT(dev_priv) \
@@ -1574,6 +1579,28 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_XEHPSDV_REVID(p, since, until) \
(IS_XEHPSDV(p) && IS_REVID(p, since, until))
 
+/*
+ * DG2 hardware steppings are a bit unusual.  The hardware design was forked
+ * to create two variants (G10 and G11) which have distinct workaround sets.
+ * The G11 fork of the DG2 design resets the GT stepping back to "A0" for its
+ * first iteration, even though it's more similar to a G10 B0 stepping in terms
+ * of functionality and workarounds.  However the display stepping does not
+ * reset in the same manner --- a specific stepping like "B0" has a consistent
+ * meaning regardless of whether it belongs to a G10 or G11 DG2.
+ *
+ * TLDR:  All GT workarounds and stepping-specific logic must be applied in
+ * relation to a specific subplatform (G10 or G11), whereas display workarounds
+ * and stepping-specific logic will be applied with a general DG2-wide stepping
+ * number.
+ */
+#define IS_DG2_GT_STEP(__i915, variant, since, until) \
+   (IS_SUBPLATFORM(__i915, INTEL_DG2, INTEL_SUBPLATFORM_##variant) && \
+IS_GT_STEP(__i915, since, until))
+
+#define IS_DG2_DISP_STEP(__i915, since, until) \
+   (IS_DG2(__i915) && \
+IS_DISPLAY_STEP(__i915, since, until))
+
 #define IS_LP(dev_priv)(INTEL_INFO(dev_priv)->is_lp)
 #define IS_GEN9_LP(dev_priv)   (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv))
 #define IS_GEN9_BC(dev_priv)   (GRAPHICS_VER(dev_priv) == 9 && 
!IS_LP(dev_priv))
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 046309e95f43..a41e1792c0ff 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -1040,6 +1040,22 @@ static const struct intel_device_info xehpsdv_info = {
.require_force_probe = 1,
 };
 
+__maybe_unused
+static const struct intel_device_info dg2_info = {
+   XE_HP_FEATURES,
+   XE_HPM_FEATURES,
+   XE_LPD_FEATURES,
+   DGFX_FEATURES,
+   .graphics_ver_release = 55,
+   .media_ver_release = 55,
+   PLATFORM(INTEL_DG2),
+   .platform_engine_mask =
+   BIT(RCS0) | BIT(BCS0) |
+   BIT(VECS0) | BIT(VECS1) |
+   BIT(VCS0) | BIT(VCS2),
+   .require_force_probe = 1,
+};
+
 #undef PLATFORM
 
 /*
diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
b/drivers/gpu/drm/i915/intel_device_info.c
index 7b37b68f4548..41205dc356b7 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -69,6 +69,7 @@ static const char * const platform_names[] = {
PLATFORM_NAME(ALDERLAKE_S),
PLATFORM_NAME(ALDERLAKE_P),
PLATFORM_NAME(XEHPSDV),
+   PLATFORM_NAME(DG2),
 };
 #undef PLATFORM_NAME
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index e8684199b0c9..856f0aa5d68f 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -89,6 +89,7 @@ enum intel_platform {
INTEL_ALDERLAKE_S,
INTEL_ALDERLAKE_P,
INTEL_XEHPSDV,
+   INTEL_DG2,
INTEL_MAX_PLATFORMS
 };
 
@@ -107,6 +108,10 @@ enum intel_platform {
 /* CNL/ICL */
 #define INTEL_SUBPLATFORM_PORTF(0)
 
+/* DG2 */
+#define INTEL_SUBPLATFORM_G10  0
+#define 

[PATCH 01/53] drm/i915: Add "release id" version

2021-07-01 Thread Matt Roper
From: Lucas De Marchi 

Besides the arch version returned by GRAPHICS_VER(), new platforms
contain a "release id" to make clear the difference from one platform to
another. Although for the first ones we may use them as if they were a
major/minor version, that is not true for all platforms: we may have a
`release_id == n` that is closer to `n - 2` than to `n - 1`.

However, the release id number is not defined by hardware until we start
using the GMD_ID register. For platforms that predate a usable GMD_ID
register, we set the values in software and can choose them as we
please. So the plan is to set them so we can group different features
under a single GRAPHICS_VER_FULL() check.

After GMD_ID is used, the usefulness of a "full version check" will be
greatly reduced and will be mostly used for deciding workarounds and a
few code paths. So it makes sense to keep it as a separate field from
graphics_ver.

Also, currently there is not much use for the release id in media and
display, so keep them out.

This is a mix of 2 independent changes: one by me and the other by Matt
Roper.

Cc: Matt Roper 
Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/i915_drv.h  | 6 ++
 drivers/gpu/drm/i915/intel_device_info.c | 2 ++
 drivers/gpu/drm/i915/intel_device_info.h | 2 ++
 3 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6dff4ca01241..9639800485b9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1258,11 +1258,17 @@ static inline struct drm_i915_private 
*pdev_to_i915(struct pci_dev *pdev)
  */
 #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
 
+#define IP_VER(ver, release)   ((ver) << 8 | (release))
+
 #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \
+                                       INTEL_INFO(i915)->graphics_ver_release)
 #define IS_GRAPHICS_VER(i915, from, until) \
(GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
 
 #define MEDIA_VER(i915)         (INTEL_INFO(i915)->media_ver)
+#define MEDIA_VER_FULL(i915)    IP_VER(INTEL_INFO(i915)->media_ver, \
+                                       INTEL_INFO(i915)->media_ver_release)
 #define IS_MEDIA_VER(i915, from, until) \
(MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
 
diff --git a/drivers/gpu/drm/i915/intel_device_info.c 
b/drivers/gpu/drm/i915/intel_device_info.c
index 7eaa92fee421..e8ad14f002c1 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct 
intel_device_info *info,
struct drm_printer *p)
 {
drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
+   drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
drm_printf(p, "media_ver: %u\n", info->media_ver);
+   drm_printf(p, "media_ver_release: %u\n", info->media_ver_release);
drm_printf(p, "display_ver: %u\n", info->display.ver);
drm_printf(p, "gt: %d\n", info->gt);
drm_printf(p, "iommu: %s\n", iommu_name());
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index b326aff65cd6..944a5ff4df49 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -162,7 +162,9 @@ enum intel_ppgtt_type {
 
 struct intel_device_info {
u8 graphics_ver;
+   u8 graphics_ver_release;
u8 media_ver;
+   u8 media_ver_release;
 
u8 gt; /* GT number, 0 if undefined */
intel_engine_mask_t platform_engine_mask; /* Engines supported by the 
HW */
-- 
2.25.4
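
A rough user-space sketch of the IP_VER() packing described in the patch
above. The macro body is copied from the patch; struct toy_device_info and
the two helper functions are hypothetical stand-ins, not driver code. The
point is only that a (version, release) pair becomes a single integer that
can be compared with ordinary relational operators.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the macro in the patch: version in the high byte,
 * release id in the low byte. */
#define IP_VER(ver, release) ((ver) << 8 | (release))

/* Compile-time sanity check: a higher release id sorts after a lower one. */
static_assert(IP_VER(12, 50) > IP_VER(12, 0), "release id must order versions");

/* Hypothetical stand-in for the two fields added to struct intel_device_info. */
struct toy_device_info {
	uint8_t graphics_ver;
	uint8_t graphics_ver_release;
};

/* Equivalent of GRAPHICS_VER_FULL() for the toy structure. */
static inline unsigned int toy_graphics_ver_full(const struct toy_device_info *info)
{
	return IP_VER((unsigned int)info->graphics_ver,
		      (unsigned int)info->graphics_ver_release);
}

/* One "full version" check groups everything from 12.50 onwards. */
static inline int toy_is_12_50_or_later(const struct toy_device_info *info)
{
	return toy_graphics_ver_full(info) >= IP_VER(12u, 50u);
}
```

Because the release id occupies the low byte, comparisons order first by
graphics_ver and then by release, which is what a range-style check needs.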



[PATCH 05/53] drm/i915/gen12: Use fuse info to enable SFC

2021-07-01 Thread Matt Roper
From: Venkata Sandeep Dhanalakota 

In Gen12 there are various fuse combinations, and in each configuration a
vdbox engine may or may not be connected to an SFC depending on which
engines are available, so we need to set the SFC capability based on the
fuse value from the hardware. Even numbered physical instances always have
SFC; odd numbered physical instances have SFC only if the previous even
instance is fused off.

Bspec: 48028
Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: Venkata Sandeep Dhanalakota 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++-
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 151870d8fdd3..4ab2c9abb943 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt)
}
 }
 
+static inline
+bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox,
+  unsigned int logical_vdbox, u16 vdbox_mask)
+{
+   /*
+* In Gen11, only even numbered logical VDBOXes are hooked
+* up to an SFC (Scaler & Format Converter) unit.
+* In Gen12, even numbered physical instances are always connected
+* to an SFC. Odd numbered physical instances have SFC only if
+* the previous even instance is fused off.
+*/
+   if (GRAPHICS_VER(i915) == 12) {
+   return (physical_vdbox % 2 == 0) ||
+   !(BIT(physical_vdbox - 1) & vdbox_mask);
+   } else if (GRAPHICS_VER(i915) == 11) {
+   return logical_vdbox % 2 == 0;
+   }
+
+   MISSING_CASE(GRAPHICS_VER(i915));
+   return false;
+}
+
 /*
  * Determine which engines are fused off in our particular hardware.
  * Note that we have a catch-22 situation where we need to be able to access
@@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
continue;
}
 
-   /*
-* In Gen11, only even numbered logical VDBOXes are
-* hooked up to an SFC (Scaler & Format Converter) unit.
-* In TGL each VDBOX has access to an SFC.
-*/
-   if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
+   if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
gt->info.vdbox_sfc_access |= BIT(i);
+   logical_vdbox++;
}
drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n",
vdbox_mask, VDBOX_MASK(gt));
-- 
2.25.4
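
The Gen12 rule from the patch above can be tried out in isolation with a
small user-space sketch. The function names and the fixed 16-bit mask width
are assumptions made for illustration; only the even/odd rule itself comes
from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Gen12 rule: an even physical instance always has an SFC; an odd instance
 * has one only if the preceding even instance is fused off, i.e. its bit
 * is clear in vdbox_mask. */
static bool toy_gen12_vdbox_has_sfc(unsigned int physical_vdbox, uint16_t vdbox_mask)
{
	assert(physical_vdbox < 16);

	if (physical_vdbox % 2 == 0)
		return true;

	return !(BIT(physical_vdbox - 1) & vdbox_mask);
}

/* Walk the enabled instances and collect the ones with SFC access. */
static uint16_t toy_gen12_sfc_access(uint16_t vdbox_mask)
{
	uint16_t sfc = 0;

	for (unsigned int i = 0; i < 16; i++) {
		if (!(vdbox_mask & BIT(i)))
			continue;	/* instance is fused off */
		if (toy_gen12_vdbox_has_sfc(i, vdbox_mask))
			sfc |= BIT(i);
	}

	return sfc;
}
```

With all of VCS0..VCS3 present only the even instances get SFC access; with
only VCS1 and VCS3 present both odd instances do, since their even
neighbours are fused off.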



[PATCH 02/53] drm/i915: Add XE_HP initial definitions

2021-07-01 Thread Matt Roper
From: Lucas De Marchi 

Our _FEATURES macros went back to GEN7, each extending the previous one,
making it difficult to grasp what was really enabled/disabled. Take the
opportunity of the GEN -> XE_HP name break to also break with the
feature inheritance.

For XE_HP this basically goes from GEN12 back to GEN7, coalescing the
features while making sure the overrides remain, removing all the
display-specific features, and sorting the result.

Then also remove the definitions that would be overridden by
DGFX_FEATURES and those that were 0 (since that is the default).
Exception here is has_master_unit_irq: although it is a feature that
started with DG1 and is true for all DGFX platforms, it's also true for
XE_HP in general.

Signed-off-by: Lucas De Marchi 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/i915_pci.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index a7bfdd827bc8..dc0883bad9cf 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -995,6 +995,30 @@ static const struct intel_device_info adl_p_info = {
 };
 
 #undef GEN
+
+#define XE_HP_PAGE_SIZES \
+   .page_sizes = I915_GTT_PAGE_SIZE_4K | \
+ I915_GTT_PAGE_SIZE_64K | \
+ I915_GTT_PAGE_SIZE_2M
+
+#define XE_HP_FEATURES \
+   .graphics_ver = 12, \
+   .graphics_ver_release = 50, \
+   XE_HP_PAGE_SIZES, \
+   .dma_mask_size = 46, \
+   .has_64bit_reloc = 1, \
+   .has_global_mocs = 1, \
+   .has_gt_uc = 1, \
+   .has_llc = 1, \
+   .has_logical_ring_contexts = 1, \
+   .has_logical_ring_elsq = 1, \
+   .has_rc6 = 1, \
+   .has_reset_engine = 1, \
+   .has_rps = 1, \
+   .has_runtime_pm = 1, \
+   .ppgtt_size = 48, \
+   .ppgtt_type = INTEL_PPGTT_FULL
+
 #undef PLATFORM
 
 /*
-- 
2.25.4
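
As a loose illustration of the flattened feature-macro style described in
the patch above, the sketch below composes a device-info initializer from
macros that expand to designated initializers. Every name in it (the
toy_ struct, the TOY_ macros, the field set) is made up for the example;
it only relies on the standard C rule that members not named in an
initializer are zero, which is why fields that would be 0 need not be
listed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily reduced device-info structure. */
struct toy_device_info {
	uint8_t graphics_ver;
	uint8_t graphics_ver_release;
	uint8_t has_rc6;
	uint8_t has_rps;
	uint8_t has_display;		/* never set below: defaults to 0 */
	uint8_t is_dgfx;
	uint8_t require_force_probe;
};

/* One flat, sorted list per IP block instead of a chain of macros that
 * each extend the previous generation. */
#define TOY_XE_HP_FEATURES \
	.graphics_ver = 12, \
	.graphics_ver_release = 50, \
	.has_rc6 = 1, \
	.has_rps = 1

#define TOY_DGFX_FEATURES \
	.is_dgfx = 1

static const struct toy_device_info toy_platform_info = {
	TOY_XE_HP_FEATURES,
	TOY_DGFX_FEATURES,
	.require_force_probe = 1,
};
```

Reading toy_platform_info requires looking at two short macros rather than
following an inheritance chain back several generations.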



[PATCH 08/53] drm/i915/xehp: Extra media engines - Part 2 (interrupts)

2021-07-01 Thread Matt Roper
From: John Harrison 

Xe_HP can have a lot of extra media engines. This patch adds the
interrupt handler support for them.

Cc: Tvrtko Ursulin 
Cc: Daniele Ceraolo Spurio 
Signed-off-by: John Harrison 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 -
 drivers/gpu/drm/i915/i915_reg.h|  3 +++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c 
b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
index c13462274fe8..b2de83be4d97 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c
@@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK,~0);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK,   ~0);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK,   ~0);
+   if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
+   intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK,   ~0);
+   if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
+   intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK,   ~0);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0);
+   if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
+   intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
 
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK,  ~0);
@@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
+   intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask);
+   if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
+   intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);
-
+   if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
+   intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask);
/*
 * RPS interrupts will get enabled/disabled on demand when RPS itself
 * is enabled/disabled.
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index d4546e871833..cb1716b6ce72 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -8076,7 +8076,10 @@ enum {
 #define GEN11_BCS_RSVD_INTR_MASK   _MMIO(0x1900a0)
 #define GEN11_VCS0_VCS1_INTR_MASK  _MMIO(0x1900a8)
 #define GEN11_VCS2_VCS3_INTR_MASK  _MMIO(0x1900ac)
+#define GEN12_VCS4_VCS5_INTR_MASK  _MMIO(0x1900b0)
+#define GEN12_VCS6_VCS7_INTR_MASK  _MMIO(0x1900b4)
 #define GEN11_VECS0_VECS1_INTR_MASK    _MMIO(0x1900d0)
+#define GEN12_VECS2_VECS3_INTR_MASK    _MMIO(0x1900d4)
 #define GEN11_GUC_SG_INTR_MASK         _MMIO(0x1900e8)
 #define GEN11_GPM_WGBOXPERF_INTR_MASK  _MMIO(0x1900ec)
 #define GEN11_CRYPTO_RSVD_INTR_MASK    _MMIO(0x1900f0)
-- 
2.25.4



[PATCH 10/53] drm/i915/xehp: Xe_HP forcewake support

2021-07-01 Thread Matt Roper
Implement Xe_HP forcewake handling.  While we're at it, let's reorder
the forcewake assignment if/else ladder to match our usual driver
conventions.

Co-authored-by: Daniele Ceraolo Spurio 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Stuart Summers 
Signed-off-by: Matt Roper 
---
 .../drm/i915/gt/intel_execlists_submission.c  |   4 +
 drivers/gpu/drm/i915/intel_uncore.c   | 336 +++---
 drivers/gpu/drm/i915/intel_uncore.h   |  14 +-
 drivers/gpu/drm/i915/selftests/intel_uncore.c |   2 +
 4 files changed, 302 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index cdb2126a159a..15ba0d83151a 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3318,6 +3318,10 @@ int intel_execlists_submission_setup(struct 
intel_engine_cs *engine)
i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base));
execlists->ctrl_reg = uncore->regs +
i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base));
+
+   engine->fw_domain = 
intel_uncore_forcewake_for_reg(engine->uncore,
+   RING_EXECLIST_CONTROL(engine->mmio_base),
+   FW_REG_WRITE);
} else {
execlists->submit_reg = uncore->regs +
i915_mmio_reg_offset(RING_ELSP(base));
diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index d067524f9162..676b0052f01e 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 
+#include "gt/intel_lrc_reg.h" /* for shadow reg list */
+
 #include "i915_drv.h"
 #include "i915_trace.h"
 #include "i915_vgpu.h"
@@ -68,8 +70,14 @@ static const char * const forcewake_domain_names[] = {
"vdbox1",
"vdbox2",
"vdbox3",
+   "vdbox4",
+   "vdbox5",
+   "vdbox6",
+   "vdbox7",
"vebox0",
"vebox1",
+   "vebox2",
+   "vebox3",
 };
 
 const char *
@@ -952,30 +960,80 @@ static const i915_reg_t gen8_shadowed_regs[] = {
 };
 
 static const i915_reg_t gen11_shadowed_regs[] = {
-   RING_TAIL(RENDER_RING_BASE),            /* 0x2000 (base) */
-   GEN6_RPNSWREQ,                          /* 0xA008 */
-   GEN6_RC_VIDEO_FREQ,                     /* 0xA00C */
-   RING_TAIL(BLT_RING_BASE),               /* 0x22000 (base) */
-   RING_TAIL(GEN11_BSD_RING_BASE),         /* 0x1C0000 (base) */
-   RING_TAIL(GEN11_BSD2_RING_BASE),        /* 0x1C4000 (base) */
-   RING_TAIL(GEN11_VEBOX_RING_BASE),       /* 0x1C8000 (base) */
-   RING_TAIL(GEN11_BSD3_RING_BASE),        /* 0x1D0000 (base) */
-   RING_TAIL(GEN11_BSD4_RING_BASE),        /* 0x1D4000 (base) */
-   RING_TAIL(GEN11_VEBOX2_RING_BASE),      /* 0x1D8000 (base) */
+   RING_TAIL(RENDER_RING_BASE),                    /* 0x2000 (base) */
+   RING_EXECLIST_CONTROL(RENDER_RING_BASE),        /* 0x2550 */
+   GEN6_RPNSWREQ,                                  /* 0xA008 */
+   GEN6_RC_VIDEO_FREQ,                             /* 0xA00C */
+   RING_TAIL(BLT_RING_BASE),                       /* 0x22000 (base) */
+   RING_EXECLIST_CONTROL(BLT_RING_BASE),           /* 0x22550 */
+   RING_TAIL(GEN11_BSD_RING_BASE),                 /* 0x1C0000 (base) */
+   RING_EXECLIST_CONTROL(GEN11_BSD_RING_BASE),     /* 0x1C0550 */
+   RING_TAIL(GEN11_BSD2_RING_BASE),                /* 0x1C4000 (base) */
+   RING_EXECLIST_CONTROL(GEN11_BSD2_RING_BASE),    /* 0x1C4550 */
+   RING_TAIL(GEN11_VEBOX_RING_BASE),               /* 0x1C8000 (base) */
+   RING_EXECLIST_CONTROL(GEN11_VEBOX_RING_BASE),   /* 0x1C8550 */
+   RING_TAIL(GEN11_BSD3_RING_BASE),                /* 0x1D0000 (base) */
+   RING_EXECLIST_CONTROL(GEN11_BSD3_RING_BASE),    /* 0x1D0550 */
+   RING_TAIL(GEN11_BSD4_RING_BASE),                /* 0x1D4000 (base) */
+   RING_EXECLIST_CONTROL(GEN11_BSD4_RING_BASE),    /* 0x1D4550 */
+   RING_TAIL(GEN11_VEBOX2_RING_BASE),              /* 0x1D8000 (base) */
+   RING_EXECLIST_CONTROL(GEN11_VEBOX2_RING_BASE),  /* 0x1D8550 */
/* TODO: Other registers are not yet used */
 };
 
 static const i915_reg_t gen12_shadowed_regs[] = {
-   RING_TAIL(RENDER_RING_BASE),            /* 0x2000 (base) */
-   GEN6_RPNSWREQ,                          /* 0xA008 */
-   GEN6_RC_VIDEO_FREQ,                     /* 0xA00C */
-   RING_TAIL(BLT_RING_BASE),               /* 0x22000 (base) */
-   RING_TAIL(GEN11_BSD_RING_BASE),         /* 0x1C0000 (base) */
-   RING_TAIL(GEN11_BSD2_RING_BASE),        /* 0x1C4000 (base) */
-   RING_TAIL(GEN11_VEBOX_RING_BASE),       /* 0x1C8000 (base) */
-   RING_TAIL(GEN11_BSD3_RING_BASE),        /* 0x1D0000 (base) */
-   

[PATCH 11/53] drm/i915/xehp: Define multicast register ranges

2021-07-01 Thread Matt Roper
Since we can't steer multicast register reads during ring-based
workaround verification, we need to define the multicast ranges where
failure to steer could potentially cause us to read back from a
fused-off register instance.

As with gen12, we can ignore the multicast ranges that the bspec
describes as 'SQIDI' since all instances of those registers will always
be present and we'll always be able to read back a workaround value that
was written with multicast.

Bspec: 66534
Cc: José Roberto de Souza 
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index d9a5a445ceec..20c6ca28e407 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2089,12 +2089,30 @@ static const struct mcr_range mcr_ranges_gen12[] = {
{},
 };
 
+static const struct mcr_range mcr_ranges_xehp[] = {
+   { .start =  0x4000, .end =  0x4aff },
+   { .start =  0x5200, .end =  0x52ff },
+   { .start =  0x5400, .end =  0x7fff },
+   { .start =  0x8140, .end =  0x815f },
+   { .start =  0x8c80, .end =  0x8dff },
+   { .start =  0x94d0, .end =  0x955f },
+   { .start =  0x9680, .end =  0x96ff },
+   { .start =  0xb000, .end =  0xb3ff },
+   { .start =  0xc800, .end =  0xcfff },
+   { .start =  0xd800, .end =  0xd8ff },
+   { .start =  0xdc00, .end =  0xffff },
+   { .start = 0x17000, .end = 0x17fff },
+   { .start = 0x24a00, .end = 0x24a7f },
+};
+
 static bool mcr_range(struct drm_i915_private *i915, u32 offset)
 {
const struct mcr_range *mcr_ranges;
int i;
 
-   if (GRAPHICS_VER(i915) >= 12)
+   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
+   mcr_ranges = mcr_ranges_xehp;
+   else if (GRAPHICS_VER(i915) >= 12)
mcr_ranges = mcr_ranges_gen12;
else if (GRAPHICS_VER(i915) >= 8)
mcr_ranges = mcr_ranges_gen8;
-- 
2.25.4
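
To make the range-table idea from the patch above concrete, here is a
self-contained sketch of looking up a register offset in a table of
inclusive [start, end] ranges. The three entries are copied from the
mcr_ranges_xehp table in the patch; the zeroed sentinel entry, the loop
shape and the toy_ names are assumptions for the example (the gen12 table
visible in the hunk context ends in an empty {} entry, which suggests a
sentinel-terminated walk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_mcr_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/* Shortened table; the final zeroed entry marks the end of the list. */
static const struct toy_mcr_range toy_ranges[] = {
	{ .start =  0x4000, .end =  0x4aff },
	{ .start =  0xb000, .end =  0xb3ff },
	{ .start = 0x17000, .end = 0x17fff },
	{ 0 },
};

/* Return true if offset falls inside any range before the sentinel. */
static bool toy_offset_is_mcr(const struct toy_mcr_range *ranges, uint32_t offset)
{
	assert(ranges != NULL);

	for (size_t i = 0; ranges[i].start; i++) {
		if (offset >= ranges[i].start && offset <= ranges[i].end)
			return true;
	}

	return false;
}
```

A walk like this only terminates correctly if the table really ends with a
zeroed entry, so the sentinel is part of the table's contract.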



[PATCH 00/53] Begin enabling Xe_HP SDV and DG2 platforms

2021-07-01 Thread Matt Roper
This series provides some of the initial enablement patches for two
upcoming discrete GPUs:
 * XeHP SDV:  Xe_HP (version 12.50) graphics IP, no display IP
 * DG2:  Xe_HPG (version 12.55) graphics IP, Xe_LPD (version 13) display IP

Both platforms will need additional enablement patches beyond what's
present in this series before they're truly usable, including various
LMEM and GuC work that's already happening separately.  The new
features/functionality that these platforms bring (such as multi-tile
support, dedicated compute engines, etc.) may be referenced in passing
in some of these patches but will be fully enabled in future series.

Cc: Rodrigo Vivi 
Cc: Lucas De Marchi 
Cc: James Ausmus 


Akeem G Abodunrin (1):
  drm/i915/dg2: Add new LRI reg offsets

Animesh Manna (1):
  drm/i915/dg2: Update to bigjoiner path

Ankit Nautiyal (1):
  drm/i915/dg2: Configure PCON in DP pre-enable path

Anusha Srivatsa (2):
  drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP
enable
  drm/i915/display/dsc: Set BPP in the kernel

Daniele Ceraolo Spurio (1):
  drm/i915/xehp: handle new steering options

Gwan-gyeong Mun (1):
  drm/i915/dg2: Update lane disable power state during PSR

John Harrison (4):
  drm/i915/selftests: Allow for larger engine counts
  drm/i915/xehp: Extra media engines - Part 1 (engine definitions)
  drm/i915/xehp: Extra media engines - Part 2 (interrupts)
  drm/i915/xehp: Extra media engines - Part 3 (reset)

José Roberto de Souza (1):
  drm/i915/dg2: Add DG2 to the PSR2 defeature list

Lucas De Marchi (5):
  drm/i915: Add "release id" version
  drm/i915: Add XE_HP initial definitions
  drm/i915/xehpsdv: add initial XeHP SDV definitions
  drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

Matt Roper (29):
  drm/i915/xehp: Xe_HP forcewake support
  drm/i915/xehp: Define multicast register ranges
  drm/i915/xehp: Loop over all gslices for INSTDONE processing
  drm/i915/xehpsdv: Add maximum sseu limits
  drm/i915/xehpsdv: Define steering tables
  drm/i915/xehpsdv: Read correct RP_STATE_CAP register
  drm/i915/dg2: add DG2 platform info
  drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV
  drm/i915/dg2: Add forcewake table
  drm/i915/dg2: Update LNCF steering ranges
  drm/i915/dg2: Add SQIDI steering
  drm/i915/dg2: Maintain backward-compatible nested batch behavior
  drm/i915/dg2: Report INSTDONE_GEOM values in error state
  drm/i915/dg2: Define MOCS table for DG2
  drm/i915/dg2: Add fake PCH
  drm/i915/dg2: Add cdclk table and reference clock
  drm/i915/dg2: Skip shared DPLL handling
  drm/i915/dg2: Don't wait for AUX power well enable ACKs
  drm/i915/dg2: Setup display outputs
  drm/i915/dg2: Add dbuf programming
  drm/i915/dg2: Don't program BW_BUDDY registers
  drm/i915/dg2: Don't read DRAM info
  drm/i915/dg2: DG2 has fixed memory bandwidth
  drm/i915/dg2: Add MPLLB programming for SNPS PHY
  drm/i915/dg2: Add MPLLB programming for HDMI
  drm/i915/dg2: Add vswing programming for SNPS phys
  drm/i915/dg2: Update modeset sequences
  drm/i915/dg2: Classify DG2 PHY types
  drm/i915/dg2: Wait for SNPS PHY calibration during display init

Matthew Auld (1):
  drm/i915/xehp: Changes to ss/eu definitions

Paulo Zanoni (1):
  drm/i915: Fork DG1 interrupt handler

Prathap Kumar Valsan (1):
  drm/i915/xehp: New engine context offsets

Stuart Summers (2):
  drm/i915/xehp: Handle new device context ID format
  drm/i915/xehpsdv: Add compute DSS type

Tvrtko Ursulin (1):
  drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based

Venkata Sandeep Dhanalakota (1):
  drm/i915/gen12: Use fuse info to enable SFC

 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/intel_bw.c   |  24 +-
 drivers/gpu/drm/i915/display/intel_cdclk.c|  24 +-
 drivers/gpu/drm/i915/display/intel_ddi.c  | 165 +++-
 drivers/gpu/drm/i915/display/intel_display.c  |  94 +-
 drivers/gpu/drm/i915/display/intel_display.h  |   1 +
 .../drm/i915/display/intel_display_debugfs.c  | 103 ++-
 .../drm/i915/display/intel_display_power.c|  25 +
 .../drm/i915/display/intel_display_power.h|  10 +
 .../drm/i915/display/intel_display_types.h|  18 +-
 drivers/gpu/drm/i915/display/intel_dp.c   |  23 +-
 drivers/gpu/drm/i915/display/intel_dpll.c |  12 +-
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c |   5 +-
 drivers/gpu/drm/i915/display/intel_hdmi.c |  11 +
 drivers/gpu/drm/i915/display/intel_psr.c  |  10 +-
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 862 ++
 drivers/gpu/drm/i915/display/intel_snps_phy.h |  35 +
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c   |   8 +-
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c  |   7 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 144 ++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  29 +-
 .../drm/i915/gt/intel_execlists_submission.c  |  78 +-
 drivers/gpu/drm/i915/gt/intel_gt.c|  66 +-
 drivers/gpu/drm/i915/gt/intel_gt.h   

Re: [git pull] drm for 5.14-rc1

2021-07-01 Thread pr-tracker-bot
The pull request you sent on Thu, 1 Jul 2021 14:34:15 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-next-2021-07-01

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e058a84bfddc42ba356a2316f2cf1141974625c9

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH v1 2/2] drm/i915/gem: Migrate to system at dma-buf attach time

2021-07-01 Thread Michael J. Ruhl
From: Thomas Hellström 

Until we support p2p dma or as a complement to that, migrate data
to system memory at dma-buf attach time if possible.

v2:
- Rebase on dynamic exporter. Update the igt_dmabuf_import_same_driver
  selftest to migrate if we are LMEM capable.
v3:
- Migrate also in the pin() callback.
v4:
- Migrate in attach

Signed-off-by: Thomas Hellström 
Signed-off-by: Michael J. Ruhl 
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c   | 12 +++-
 drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c |  4 +++-
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index ccae17d5f441..280291a4a9dc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -170,9 +170,19 @@ static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
  struct dma_buf_attachment *attach)
 {
struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
+   int ret;
 
assert_object_held(obj);
-   return i915_gem_object_pin_pages(obj);
+
+   if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM))
+   return -EOPNOTSUPP;
+   ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
+   if (!ret)
+   ret = i915_gem_object_wait_migration(obj, 0);
+   if (!ret)
+   ret = i915_gem_object_pin_pages(obj);
+
+   return ret;
 }
 
 static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
index 868b3469ecbd..b1e87ec08741 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c
@@ -106,7 +106,9 @@ static int igt_dmabuf_import_same_driver(void *arg)
int err;
 
force_different_devices = true;
-   obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
+   obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0);
+   if (IS_ERR(obj))
+   obj = i915_gem_object_create_shmem(i915, PAGE_SIZE);
if (IS_ERR(obj))
goto out_ret;
 
-- 
2.31.1



[PATCH v1 1/2] drm/i915/gem: Correct the locking and pin pattern for dma-buf

2021-07-01 Thread Michael J. Ruhl
From: Thomas Hellström 

If our exported dma-bufs are imported by another instance of our driver,
that instance will typically have the imported dma-bufs locked during
dma_buf_map_attachment(). But the exporter also locks the same reservation
object in the map_dma_buf() callback, which leads to recursive locking.

So taking the lock inside _pin_pages_unlocked() is incorrect.

Additionally, the current pinning code path is contrary to the defined
way that pinning should occur.

Remove the explicit pin/unpin from the map/unmap functions and move them
to attach/detach, allowing correct locking to occur and matching
the static dma-buf drm_prime pattern.

Add a live selftest to exercise both dynamic and non-dynamic
exports.

v2:
- Extend the selftest with a fake dynamic importer.
- Provide real pin and unpin callbacks to not abuse the interface.
v3: (ruhl)
- Remove the dynamic export support and move the pinning into the
  attach/detach path.
v4: (ruhl)
- Put pages does not need to assert on the dma-resv

Reported-by: Michael J. Ruhl 
Signed-off-by: Thomas Hellström 
Signed-off-by: Michael J. Ruhl 
---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|  44 +--
 .../drm/i915/gem/selftests/i915_gem_dmabuf.c  | 116 +-
 2 files changed, 146 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c 
b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
index 616c3a2f1baf..ccae17d5f441 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c
@@ -12,6 +12,8 @@
 #include "i915_gem_object.h"
 #include "i915_scatterlist.h"
 
+I915_SELFTEST_DECLARE(static bool force_different_devices;)
+
 static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf)
 {
return to_intel_bo(buf->priv);
@@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
struct scatterlist *src, *dst;
int ret, i;
 
-   ret = i915_gem_object_pin_pages_unlocked(obj);
-   if (ret)
-   goto err;
-
/* Copy sg so that we make an independent mapping */
st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
if (st == NULL) {
ret = -ENOMEM;
-   goto err_unpin_pages;
+   goto err;
}
 
ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL);
@@ -58,8 +56,6 @@ static struct sg_table *i915_gem_map_dma_buf(struct 
dma_buf_attachment *attachme
sg_free_table(st);
 err_free:
kfree(st);
-err_unpin_pages:
-   i915_gem_object_unpin_pages(obj);
 err:
return ERR_PTR(ret);
 }
@@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment 
*attachment,
   struct sg_table *sg,
   enum dma_data_direction dir)
 {
-   struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf);
-
dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC);
sg_free_table(sg);
kfree(sg);
-
-   i915_gem_object_unpin_pages(obj);
 }
 
 static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map 
*map)
@@ -168,7 +160,32 @@ static int i915_gem_end_cpu_access(struct dma_buf 
*dma_buf, enum dma_data_direct
return err;
 }
 
+/**
+ * i915_gem_dmabuf_attach - Do any extra attach work necessary
+ * @dmabuf: imported dma-buf
+ * @attach: new attach to do work on
+ *
+ */
+static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
+ struct dma_buf_attachment *attach)
+{
+   struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
+
+   assert_object_held(obj);
+   return i915_gem_object_pin_pages(obj);
+}
+
+static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf,
+  struct dma_buf_attachment *attach)
+{
+   struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
+
+   i915_gem_object_unpin_pages(obj);
+}
+
 static const struct dma_buf_ops i915_dmabuf_ops =  {
+   .attach = i915_gem_dmabuf_attach,
+   .detach = i915_gem_dmabuf_detach,
.map_dma_buf = i915_gem_map_dma_buf,
.unmap_dma_buf = i915_gem_unmap_dma_buf,
.release = drm_gem_dmabuf_release,
@@ -204,6 +221,8 @@ static int i915_gem_object_get_pages_dmabuf(struct 
drm_i915_gem_object *obj)
struct sg_table *pages;
unsigned int sg_page_sizes;
 
+   assert_object_held(obj);
+
pages = dma_buf_map_attachment(obj->base.import_attach,
   DMA_BIDIRECTIONAL);
if (IS_ERR(pages))
@@ -241,7 +260,8 @@ struct drm_gem_object *i915_gem_prime_import(struct 
drm_device *dev,
if (dma_buf->ops == &i915_dmabuf_ops) {
obj = dma_buf_to_obj(dma_buf);
/* is it from our device? */
-   if (obj->base.dev == dev) {
+   if (obj->base.dev == dev &&
+   

Re: [git pull] drm for 5.14-rc1

2021-07-01 Thread Linus Torvalds
On Wed, Jun 30, 2021 at 9:34 PM Dave Airlie  wrote:
>
> Hi Linus,
>
> This is the main drm pull request for 5.14-rc1.
>
> I've done a test pull into your current tree, and hit two conflicts
> (one in vc4, one in amdgpu), both seem pretty trivial, the amdgpu one
> is recent and sfr sent out a resolution for it today.

Well, the resolutions may be trivial, but the conflict made me look at
the code, and it's buggy.

Commit 04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function")
is broken. It made the code do

mmap_read_lock(mm);
vma = find_vma(mm, start);
mmap_read_unlock(mm);

and then it *uses* that "vma" after it has dropped the lock.

That's a big no-no - once you've dropped the lock, the vma contents
simply aren't reliable any more. That mapping could now be unmapped
and removed at any time.

Now, the conflict actually made one of the uses go away (switching to
vma_lookup() means that the subsequent code no longer needs to look at
"vm_start" to verify we're actually _inside_ the vma), but it still
checks for vma->vm_file afterwards.

So those locking changes in commit 04d8d73dbcbe are completely bogus.

I tried to fix up that bug while handling the conflict, but who knows
what else similar is going on elsewhere.

So I would ask people to

 (a) verify that I didn't make things worse as I fixed things up (note
how I had to change the last argument to amdgpu_hmm_range_get_pages()
from false to true etc).

 (b) go and look at their vma lookup code: you can't just look up a
vma under the lock, and then drop the lock, and then think things stay
stable.

In particular for that (b) case: it is *NOT* enough to look up
vma->vm_file inside the lock and cache that. No - if the test is about
"no backing file before looking up pages", then you have to *keep*
holding the lock until after you've actually looked up the pages!

Because otherwise any test for "vma->vm_file" is entirely pointless,
for the same reason it's buggy to even look at it after dropping the
lock: because once you've dropped the lock, the thing you just tested
for might not be true any more.

So no, it's not valid to do

bool has_file = vma && vma->vm_file;

and then drop the lock, because you don't use 'vma' any more as a
pointer, and then use 'has_file' outside the lock. Because after
you've dropped the lock, 'has_file' is now meaningless.

So it's not just about "you can't look at vma->vm_file after dropping
the lock". It's more fundamental than that. Any *decision* you make
based on the vma is entirely pointless and moot after the lock is
dropped!

Did I fix it up correctly? Who knows. The code makes more sense to me
now and seems valid. But I really *really* want to stress how locking
is important.

You also can't just unlock in the middle of an operation - even if you
then take the lock *again* later (as amdgpu_hmm_range_get_pages() then
did), the fact that you unlocked in the middle means that all the
earlier tests you did are simply no longer valid when you re-take the
lock.

 Linus
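
The rule in the message above (make the decision and act on it under the
same lock hold) can be modelled with a toy user-space sketch. Nothing here
is kernel API: the toy_ types, the reader counter standing in for the mmap
lock, and the assert() calls standing in for lock assertions are all
invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vma {
	unsigned long start, end;	/* [start, end) */
	bool has_file;
};

struct toy_mm {
	int readers;			/* > 0 while the toy lock is read-held */
	struct toy_vma *vma;		/* a single mapping, for brevity */
};

static void toy_read_lock(struct toy_mm *mm)
{
	mm->readers++;
}

static void toy_read_unlock(struct toy_mm *mm)
{
	assert(mm->readers > 0);
	mm->readers--;
}

/* Lookup is only meaningful while the lock is held. */
static struct toy_vma *toy_vma_lookup(struct toy_mm *mm, unsigned long addr)
{
	assert(mm->readers > 0);

	if (mm->vma && addr >= mm->vma->start && addr < mm->vma->end)
		return mm->vma;
	return NULL;
}

/* The operation that depends on the vma checks; it also insists on the lock. */
static int toy_collect_pages(struct toy_mm *mm, const struct toy_vma *vma)
{
	assert(mm->readers > 0);
	return vma->has_file ? -1 : 1;
}

/* Check and use happen under one uninterrupted lock hold. */
static int toy_get_pages(struct toy_mm *mm, unsigned long addr)
{
	int ret = -1;

	toy_read_lock(mm);
	struct toy_vma *vma = toy_vma_lookup(mm, addr);
	if (vma && !vma->has_file)
		ret = toy_collect_pages(mm, vma);
	toy_read_unlock(mm);

	return ret;
}
```

Moving the toy_read_unlock() call above the toy_collect_pages() call would
trip the assertion in this model, which mirrors the objection raised in the
message: a result computed from the vma stops being trustworthy once the
lock is released.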


Re: [Intel-gfx] [PULL] drm-intel-next-fixes

2021-07-01 Thread Rodrigo Vivi
On Thu, Jul 01, 2021 at 11:57:53AM +0300, Jani Nikula wrote:
> On Wed, 30 Jun 2021, Rodrigo Vivi  wrote:
> > On Wed, Jun 30, 2021 at 01:05:35PM +0300, Jani Nikula wrote:
> >> On Tue, 29 Jun 2021, Rodrigo Vivi  wrote:
> >> > Hi Dave and Daniel,
> >> >
> >> > Here goes drm-intel-next-fixes-2021-06-29:
> >> >
> >> > The biggest fix is the restoration of the mmap ioctl for gen12 integrated
> >> > parts, whose absence was breaking ADL-P with the media stack.
> >> > Besides that a small selftest fix and a theoretical overflow on
> >> > i915->pipe_to_crtc_mapping.
> >> 
> >> My last fixes pull for v5.13 fell between the cracks [1]. There was one
> >> stable worthy fix, but since it was still in drm-intel-fixes when you
> >> ran dim cherry-pick-next-fixes, it was skipped for drm-intel-next-fixes.
> >> 
> >> I've now dropped the commit and pushed v5.13 to drm-intel-fixes, as
> >> we're past that point. Subsequent dim cherry-pick-next-fixes should pick
> >> it up now.
> >
> > it didn't, probably because the Fixes hash is not part of drm-next yet?!
> 
> Odd, should be.

indeed...

> 
> > I can cherry-pick that directly. Please let me know the commit id.
> 
> c88e2647c5bb ("drm/i915/display: Do not zero past infoframes.vsc")

pushed to drm-intel-next-queue... will wait for CI results and send another PR.
I hope there's still time, otherwise it can wait for the -fixes flow

> 
> Thanks,
> Jani.
> 
> 
> >
> > Thanks,
> > Rodrigo.
> >
> >> 
> >> Please do another next fixes pull request with that. (It's okay to pull
> >> this one already though, doesn't make a difference.)
> >> 
> >> 
> >> BR,
> >> Jani.
> >> 
> >> 
> >> [1] https://lore.kernel.org/r/87czsbu15r@intel.com
> >> 
> >> 
> >> 
> >> >
> >> > Thanks,
> >> > Rodrigo.
> >> >
> >> > The following changes since commit 
> >> > 1bd8a7dc28c1c410f1ceefae1f2a97c06d1a67c2:
> >> >
> >> >   Merge tag 'exynos-drm-next-for-v5.14' of 
> >> > git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into 
> >> > drm-next (2021-06-11 14:19:12 +1000)
> >> >
> >> > are available in the Git repository at:
> >> >
> >> >   git://anongit.freedesktop.org/drm/drm-intel 
> >> > tags/drm-intel-next-fixes-2021-06-29
> >> >
> >> > for you to fetch changes up to c90c4c6574f3feaf2203b5671db1907a1e15c653:
> >> >
> >> >   drm/i915: Reinstate the mmap ioctl for some platforms (2021-06-28 
> >> > 07:43:56 -0400)
> >> >
> >> > 
> >> > The biggest fix is the restoration of the mmap ioctl for gen12 integrated
> >> > parts, whose absence was breaking ADL-P with the media stack.
> >> > Besides that a small selftest fix and a theoretical overflow on
> >> > i915->pipe_to_crtc_mapping.
> >> >
> >> > 
> >> > Chris Wilson (1):
> >> >   drm/i915/selftests: Reorder tasklet_disable vs local_bh_disable
> >> >
> >> > Jani Nikula (1):
> >> >   drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary 
> >> > crtc
> >> >
> >> > Thomas Hellström (1):
> >> >   drm/i915: Reinstate the mmap ioctl for some platforms
> >> >
> >> >  drivers/gpu/drm/i915/display/intel_display.c   |  7 ++-
> >> >  drivers/gpu/drm/i915/display/intel_display_types.h |  8 
> >> >  drivers/gpu/drm/i915/display/intel_vdsc.c  | 40 +++-
> >> >  drivers/gpu/drm/i915/display/intel_vdsc.h  |  1 +
> >> >  drivers/gpu/drm/i915/gem/i915_gem_mman.c   |  7 +--
> >> >  drivers/gpu/drm/i915/gt/selftest_execlists.c   | 55 
> >> > +-
> >> >  6 files changed, 76 insertions(+), 42 deletions(-)
> >> 
> >> -- 
> >> Jani Nikula, Intel Open Source Graphics Center
> 
> -- 
> Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-01 Thread Daniel Vetter
On Thu, Jul 1, 2021 at 8:27 PM Martin Peres  wrote:
>
> On 01/07/2021 11:14, Pekka Paalanen wrote:
> > On Wed, 30 Jun 2021 11:58:25 -0700
> > John Harrison  wrote:
> >
> >> On 6/30/2021 01:22, Martin Peres wrote:
> >>> On 24/06/2021 10:05, Matthew Brost wrote:
> >>>> From: Daniele Ceraolo Spurio 
> >>>>
> >>>> Unblock GuC submission on Gen11+ platforms.
> >>>>
> >>>> Signed-off-by: Michal Wajdeczko 
> >>>> Signed-off-by: Daniele Ceraolo Spurio 
> >>>> Signed-off-by: Matthew Brost 
> >>>> ---
> >>>>   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  1 +
> >>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 
> >>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h |  3 +--
> >>>>   drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +-
> >>>>   4 files changed, 19 insertions(+), 7 deletions(-)
> >>>>
> >
> > ...
> >
> >>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>> b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>> index 7a69c3c027e9..61be0aa81492 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
> >>>> @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct
> >>>> intel_uc *uc)
> >>>>   return;
> >>>>   }
> >>>> -/* Default: enable HuC authentication only */
> >>>> -i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
> >>>> +/* Intermediate platforms are HuC authentication only */
> >>>> +if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
> >>>> +drm_dbg(&i915->drm, "Disabling GuC only due to old
> >>>> platform\n");
> >>>
> >>> This comment does not seem accurate, given that DG1 is barely out, and
> >>> ADL is not out yet. How about:
> >>>
> >>> "Disabling GuC on untested platforms"?
> >>>
> >> Just because something is not in the shops yet does not mean it is new.
> >> Technology is always obsolete by the time it goes on sale.
> >
> > That is a very good reason to not use terminology like "new", "old",
> > "current", "modern" etc. at all.
> >
> > End users like me definitely do not share your interpretation of "old".
>
> Yep, old and new is relative. In the end, what matters is the validation
> effort, which is why I was proposing "untested platforms".
>
> Also, remember that you are not writing these messages for Intel
> engineers, but instead are writing for Linux *users*.

It's drm_dbg. Users don't read this stuff, at least not users with no
clue what the driver does and stuff like that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-01 Thread Martin Peres

On 01/07/2021 11:14, Pekka Paalanen wrote:
> On Wed, 30 Jun 2021 11:58:25 -0700
> John Harrison  wrote:
>
>> On 6/30/2021 01:22, Martin Peres wrote:
>>> On 24/06/2021 10:05, Matthew Brost wrote:
>>>> From: Daniele Ceraolo Spurio 
>>>>
>>>> Unblock GuC submission on Gen11+ platforms.
>>>>
>>>> Signed-off-by: Michal Wajdeczko 
>>>> Signed-off-by: Daniele Ceraolo Spurio 
>>>> Signed-off-by: Matthew Brost 
>>>> ---
>>>>   drivers/gpu/drm/i915/gt/uc/intel_guc.h    |  1 +
>>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 
>>>>   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h |  3 +--
>>>>   drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +-
>>>>   4 files changed, 19 insertions(+), 7 deletions(-)
>>>>
>
> ...
>
>>>> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>>>> b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>>>> index 7a69c3c027e9..61be0aa81492 100644
>>>> --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>>>> +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
>>>> @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct
>>>> intel_uc *uc)
>>>>   return;
>>>>   }
>>>> -    /* Default: enable HuC authentication only */
>>>> -    i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
>>>> +    /* Intermediate platforms are HuC authentication only */
>>>> +    if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
>>>> +    drm_dbg(&i915->drm, "Disabling GuC only due to old
>>>> platform\n");
>>>
>>> This comment does not seem accurate, given that DG1 is barely out, and
>>> ADL is not out yet. How about:
>>>
>>> "Disabling GuC on untested platforms"?
>>>
>> Just because something is not in the shops yet does not mean it is new.
>> Technology is always obsolete by the time it goes on sale.
>
> That is a very good reason to not use terminology like "new", "old",
> "current", "modern" etc. at all.
>
> End users like me definitely do not share your interpretation of "old".

Yep, old and new is relative. In the end, what matters is the validation
effort, which is why I was proposing "untested platforms".

Also, remember that you are not writing these messages for Intel
engineers, but instead are writing for Linux *users*.

Cheers,
Martin

>
> Thanks,
> pq
>
>> And the issue is not a lack of testing, it is a question of whether we
>> are allowed to change the default on something that has already started
>> being used by customers or not (including pre-release beta customers).
>> I.e. it is basically a political decision not an engineering decision.




Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+

2021-07-01 Thread Martin Peres

On 30/06/2021 21:00, Matthew Brost wrote:

On Wed, Jun 30, 2021 at 11:22:38AM +0300, Martin Peres wrote:



On 24/06/2021 10:05, Matthew Brost wrote:

From: Daniele Ceraolo Spurio 

Unblock GuC submission on Gen11+ platforms.

Signed-off-by: Michal Wajdeczko 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
---
   drivers/gpu/drm/i915/gt/uc/intel_guc.h|  1 +
   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c |  8 
   drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h |  3 +--
   drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +-
   4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index fae01dc8e1b9..77981788204f 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -54,6 +54,7 @@ struct intel_guc {
struct ida guc_ids;
struct list_head guc_id_list;
+   bool submission_supported;
bool submission_selected;
struct i915_vma *ads_vma;
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index a427336ce916..405339202280 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -2042,6 +2042,13 @@ void intel_guc_submission_disable(struct intel_guc *guc)
/* Note: By the time we're here, GuC may have already been reset */
   }
+static bool __guc_submission_supported(struct intel_guc *guc)
+{
+   /* GuC submission is unavailable for pre-Gen11 */
+   return intel_guc_is_supported(guc) &&
+  INTEL_GEN(guc_to_gt(guc)->i915) >= 11;
+}
+
   static bool __guc_submission_selected(struct intel_guc *guc)
   {
struct drm_i915_private *i915 = guc_to_gt(guc)->i915;
@@ -2054,6 +2061,7 @@ static bool __guc_submission_selected(struct intel_guc 
*guc)
   void intel_guc_submission_init_early(struct intel_guc *guc)
   {
+   guc->submission_supported = __guc_submission_supported(guc);
guc->submission_selected = __guc_submission_selected(guc);
   }
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
index a2a3fad72be1..be767eb6ff71 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h
@@ -37,8 +37,7 @@ int intel_guc_wait_for_pending_msg(struct intel_guc *guc,
   static inline bool intel_guc_submission_is_supported(struct intel_guc *guc)
   {
-   /* XXX: GuC submission is unavailable for now */
-   return false;
+   return guc->submission_supported;
   }
   static inline bool intel_guc_submission_is_wanted(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
index 7a69c3c027e9..61be0aa81492 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c
@@ -34,8 +34,15 @@ static void uc_expand_default_options(struct intel_uc *uc)
return;
}
-   /* Default: enable HuC authentication only */
-   i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
+   /* Intermediate platforms are HuC authentication only */
+   if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
+   drm_dbg(&i915->drm, "Disabling GuC only due to old platform\n");


This comment does not seem accurate, given that DG1 is barely out, and ADL
is not out yet. How about:

"Disabling GuC on untested platforms"?


This isn't my comment but it seems right to me. AFAIK this describes the
current PR but it is subject to change (i.e. we may enable GuC on DG1 by
default at some point).


Well, it's pretty bad PR to say that DG1 and ADL are old when they are 
not even out ;)


But seriously, fix this sentence, it makes no sense at all unless you 
are really trying to confuse non-native speakers (and annoy language 
purists too).







+   i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
+   return;
+   }
+
+   /* Default: enable HuC authentication and GuC submission */
+   i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | ENABLE_GUC_SUBMISSION;


This seems to be in contradiction with the GuC submission plan which states:

"Not enabled by default on any current platforms but can be enabled via
modparam enable_guc".



I don't believe any current platform gets this point where GuC
submission would be enabled by default. The first would be ADL-P which
isn't out yet.


Isn't that exactly what the line above does?




When you rework the patch, could you please add a warning when the user
force-enables the GuC Command Submission? Something like:

"WARNING: The user force-enabled the experimental GuC command submission
backend using i915.enable_guc. Please disable it if experiencing stability
issues. No bug reports will be accepted on this backend".

This should allow you to work on the 

Re: [PATCH 0/4] mgag200: Various cleanups

2021-07-01 Thread Sam Ravnborg
Hi Thomas,

On Thu, Jul 01, 2021 at 02:43:12PM +0200, Thomas Zimmermann wrote:
> Cleanup several nits in the driver's init code. Also move constant
> data into the RO data segment. No functional changes.
> 
> Tested on mgag200 HW.
> 
> Thomas Zimmermann (4):
>   drm/mgag200: Don't pass flags to drm_dev_register()
>   drm/mgag200: Inline mgag200_device_init()
This patch drops a redundant error message too. It would have helped me if
the changelog had said so, but whatever.

>   drm/mgag200: Extract device type and flags in mgag200_pci_probe()
>   drm/mgag200: Constify LUT for programming bpp

The full series is:
Acked-by: Sam Ravnborg 

Sam


[PATCH v5 2/2] drm/i915: Drop all references to DRM IRQ midlayer

2021-07-01 Thread Thomas Zimmermann
Remove all references to DRM's IRQ midlayer. i915 uses Linux' interrupt
functions directly.

v2:
* also remove an outdated comment
* move IRQ fix into separate patch
* update Fixes tag (Daniel)

Signed-off-by: Thomas Zimmermann 
Fixes: b318b82455bd ("drm/i915: Nuke drm_driver irq vfuncs")
Cc: Ville Syrjälä 
Cc: Chris Wilson 
Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: intel-...@lists.freedesktop.org
---
 drivers/gpu/drm/i915/i915_drv.c | 1 -
 drivers/gpu/drm/i915/i915_irq.c | 5 -
 2 files changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 62327c15f457..30d8cd8c69b1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -42,7 +42,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2203dca19895..1d4c683c9de9 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -33,7 +33,6 @@
 #include 
 
 #include 
-#include 
 
 #include "display/intel_de.h"
 #include "display/intel_display_types.h"
@@ -4564,10 +4563,6 @@ void intel_runtime_pm_enable_interrupts(struct 
drm_i915_private *dev_priv)
 
 bool intel_irqs_enabled(struct drm_i915_private *dev_priv)
 {
-   /*
-* We only use drm_irq_uninstall() at unload and VT switch, so
-* this is the only thing we need to check.
-*/
return dev_priv->runtime_pm.irqs_enabled;
 }
 
-- 
2.32.0



[PATCH v5 0/2] drm/i915: IRQ fixes

2021-07-01 Thread Thomas Zimmermann
Fix a bug in the usage of IRQs and cleanup references to the DRM
IRQ midlayer.

Preferably this patchset would be merged through drm-misc-next.

v5:
* go back to _hardirq() after CI tests reported atomic
  context in PCI probe; add a corresponding comment
v4:
* switch IRQ code to intel_synchronize_irq() (Daniel)
v3:
* also use intel_synchronize_hardirq() from other callsite
v2:
* split patch
* also fix comment
* add intel_synchronize_hardirq() (Ville)
* update Fixes tag (Daniel)

Thomas Zimmermann (2):
  drm/i915: Use the correct IRQ during resume
  drm/i915: Drop all references to DRM IRQ midlayer

 drivers/gpu/drm/i915/gt/intel_engine_cs.c   |  2 +-
 drivers/gpu/drm/i915/gt/intel_ring_submission.c |  7 +--
 drivers/gpu/drm/i915/i915_drv.c |  1 -
 drivers/gpu/drm/i915/i915_irq.c | 10 +-
 drivers/gpu/drm/i915/i915_irq.h |  1 +
 5 files changed, 12 insertions(+), 9 deletions(-)


base-commit: 67f5a18128770817e4218a9e496d2bf5047c51e8
prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d
prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24
prerequisite-patch-id: 0cca17365e65370fa95d193ed2f1c88917ee1aef
prerequisite-patch-id: 12b9894350a0b56579d29542943465ef5134751c
prerequisite-patch-id: 3e1c37d3425f4820fe36ea3da57c65e166fe0ee5
prerequisite-patch-id: 1017c860a0bf95ce370d82b8db1745f5548fb321
prerequisite-patch-id: dcc022baab7c172978de9809702c2f4f54323047
prerequisite-patch-id: 0d05ee247042b43d5ab8f3af216e708a8e09bee8
prerequisite-patch-id: 110c411161bed6072c32185940fcd052d0bdb09a
prerequisite-patch-id: d2d1aeccffdfadf2b951487b8605f59c795d84cf
prerequisite-patch-id: 85fe31e27ca13adc0d1bcc7c19b1ce238a77ee6a
prerequisite-patch-id: c61fdacbe035ba5c17f1ff393bc9087f16aaea7b
prerequisite-patch-id: c4821af5dbba4d121769f1da85d91fbb53020ec0
prerequisite-patch-id: 0b20ef3302abfe6dc123dbc54b9dd087865f935b
prerequisite-patch-id: d34eb96cbbdeb91870ace4250ea75920b1653dc2
prerequisite-patch-id: 7f64fce347d15232134d7636ca7a8d9f5bf1a3a0
prerequisite-patch-id: c83be7a285eb6682cdae0df401ab5d4c208f036b
prerequisite-patch-id: eb1a44d2eb2685cea154dd3f17f5f463dfafd39a
prerequisite-patch-id: 92a8c37dae4b8394fd6702f4af58ac7815ac3069
prerequisite-patch-id: f0237988fe4ae6eba143432d1ace8beb52d935f8
prerequisite-patch-id: bcf4d29437ed7cb78225dec4c99249eb40c18302
prerequisite-patch-id: 6407b4c7f1b80af8d329d5f796b30da11959e936
prerequisite-patch-id: 4a69e6e49d691b555f0e0874d638cd204dcb0c48
prerequisite-patch-id: be09cfa8a67dd435a25103b85bd4b1649c5190a3
prerequisite-patch-id: 813ecc9f94251c3d669155faf64c0c9e6a458393
prerequisite-patch-id: beb2b5000a1682cbd74a7e2ab1566fcae5bccbf0
prerequisite-patch-id: 754c8878611864475a0b75fd49ff38e71a21c795
prerequisite-patch-id: d7d4bac3c19f94ba9593143b3c147d83d82cb71f
prerequisite-patch-id: 983d1efbe060743f5951e474961fa431d886d757
prerequisite-patch-id: 3c78b20c3b9315cd39e0ae9ea1510c6121bf9ca9
--
2.32.0



[PATCH v5 1/2] drm/i915: Use the correct IRQ during resume

2021-07-01 Thread Thomas Zimmermann
The code in xcs_resume() probably didn't work as intended. It uses
struct drm_device.irq, which is zero-initialized at allocation but never
set by i915 to the device's interrupt number.

Change all calls to synchronize_hardirq() to intel_synchronize_hardirq(),
which uses the correct interrupt. The _hardirq() variant is kept because
xcs_resume() can be called from atomic context during PCI probe.

v5:
* go back to _hardirq() after PCI probe reported wrong
  context; add a corresponding comment
v4:
* switch everything to intel_synchronize_irq() (Daniel)
v3:
* also use intel_synchronize_hardirq() at another callsite
v2:
* wrap irq code in intel_synchronize_hardirq() (Ville)

Signed-off-by: Thomas Zimmermann 
Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again")
Cc: Chris Wilson 
Cc: Mika Kuoppala 
Cc: Daniel Vetter 
Cc: Rodrigo Vivi 
Cc: Joonas Lahtinen 
Cc: Maarten Lankhorst 
Cc: Lucas De Marchi 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c   | 2 +-
 drivers/gpu/drm/i915/gt/intel_ring_submission.c | 7 +--
 drivers/gpu/drm/i915/i915_irq.c | 5 +
 drivers/gpu/drm/i915/i915_irq.h | 1 +
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 88694822716a..5ca3d1664335 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1229,7 +1229,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
return true;
 
/* Waiting to drain ELSP? */
-   synchronize_hardirq(to_pci_dev(engine->i915->drm.dev)->irq);
+   intel_synchronize_hardirq(engine->i915);
intel_engine_flush_submission(engine);
 
/* ELSP is empty, but there are ready requests? E.g. after reset */
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c 
b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 5d42a12ef3d6..5c4d204d07cc 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -184,8 +184,11 @@ static int xcs_resume(struct intel_engine_cs *engine)
ENGINE_TRACE(engine, "ring:{HEAD:%04x, TAIL:%04x}\n",
 ring->head, ring->tail);
 
-   /* Double check the ring is empty & disabled before we resume */
-   synchronize_hardirq(engine->i915->drm.irq);
+   /*
+* Double check the ring is empty & disabled before we resume. Called
+* from atomic context during PCI probe, so _hardirq().
+*/
+   intel_synchronize_hardirq(engine->i915);
if (!stop_ring(engine))
goto err;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7d0ce8b9f8ed..2203dca19895 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -4575,3 +4575,8 @@ void intel_synchronize_irq(struct drm_i915_private *i915)
 {
synchronize_irq(to_pci_dev(i915->drm.dev)->irq);
 }
+
+void intel_synchronize_hardirq(struct drm_i915_private *i915)
+{
+   synchronize_hardirq(to_pci_dev(i915->drm.dev)->irq);
+}
diff --git a/drivers/gpu/drm/i915/i915_irq.h b/drivers/gpu/drm/i915/i915_irq.h
index db34d5dbe402..e43b6734f21b 100644
--- a/drivers/gpu/drm/i915/i915_irq.h
+++ b/drivers/gpu/drm/i915/i915_irq.h
@@ -94,6 +94,7 @@ void intel_runtime_pm_disable_interrupts(struct 
drm_i915_private *dev_priv);
 void intel_runtime_pm_enable_interrupts(struct drm_i915_private *dev_priv);
 bool intel_irqs_enabled(struct drm_i915_private *dev_priv);
 void intel_synchronize_irq(struct drm_i915_private *i915);
+void intel_synchronize_hardirq(struct drm_i915_private *i915);
 
 int intel_get_crtc_scanline(struct intel_crtc *crtc);
 void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
-- 
2.32.0
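
The bug described in the commit message can be reproduced in miniature outside the kernel. The sketch below is only an illustration under stated assumptions: struct mock_pci_dev, struct mock_drm_device, mock_synchronize_hardirq() and the two callers are hypothetical user-space stand-ins, not the i915 or DRM structures. It shows why reading a never-initialized per-device irq field ends up synchronizing against IRQ 0, and why routing every caller through one helper that asks the PCI device avoids that.

```c
/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct mock_pci_dev {
	unsigned int irq;		/* the interrupt the device really uses */
};

struct mock_drm_device {
	struct mock_pci_dev *pdev;
	unsigned int irq;		/* legacy midlayer field, left at 0 here */
};

/* Records which interrupt number a caller asked to synchronize against. */
static unsigned int last_synced_irq;

static void mock_synchronize_hardirq(unsigned int irq)
{
	last_synced_irq = irq;
}

/* Old pattern: relies on a field the driver never fills in. */
static void resume_old(struct mock_drm_device *drm)
{
	mock_synchronize_hardirq(drm->irq);
}

/* New pattern: a single helper that always uses the PCI device's IRQ. */
static void mock_intel_synchronize_hardirq(struct mock_drm_device *drm)
{
	mock_synchronize_hardirq(drm->pdev->irq);
}
```

With a device whose real interrupt is, say, 42, resume_old() synchronizes against 0 while the helper synchronizes against 42.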



[PATCH v2 1/2] drm/gud: Free buffers on device removal

2021-07-01 Thread Noralf Trønnes
Free transfer and compression buffers on device removal instead of at
DRM device removal time. This ensures that the usual 2x8MB buffers are
released when the device is unplugged and not kept around should
userspace keep the DRM device fd open.

At least Ubuntu 20.04 doesn't release the DRM device on unplug.

The damage_lock mutex is not destroyed because it is used outside the
drm_dev_enter/exit block in gud_pipe_update(). AFAICT it's possible for
an open fbdev descriptor to trigger a commit after the USB device is gone.

v2: Don't destroy damage_lock

Reviewed-by: Linus Walleij 
Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_drv.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index e8b672dc9832..45427c73587f 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -394,14 +394,15 @@ static const struct drm_driver gud_drm_driver = {
.minor  = 0,
 };
 
-static void gud_free_buffers_and_mutex(struct drm_device *drm, void *unused)
+static void gud_free_buffers_and_mutex(void *data)
 {
-   struct gud_device *gdrm = to_gud_device(drm);
+   struct gud_device *gdrm = data;
 
vfree(gdrm->compress_buf);
+   gdrm->compress_buf = NULL;
kfree(gdrm->bulk_buf);
+   gdrm->bulk_buf = NULL;
mutex_destroy(&gdrm->ctrl_lock);
-   mutex_destroy(&gdrm->damage_lock);
 }
 
 static int gud_probe(struct usb_interface *intf, const struct usb_device_id 
*id)
@@ -455,7 +456,7 @@ static int gud_probe(struct usb_interface *intf, const 
struct usb_device_id *id)
INIT_WORK(&gdrm->work, gud_flush_work);
gud_clear_damage(gdrm);
 
-   ret = drmm_add_action_or_reset(drm, gud_free_buffers_and_mutex, NULL);
+   ret = devm_add_action(dev, gud_free_buffers_and_mutex, gdrm);
if (ret)
return ret;
 
-- 
2.23.0



[PATCH v2 2/2] drm/gud: Use scatter-gather USB bulk transfer

2021-07-01 Thread Noralf Trønnes
There's a limit to how big a kmalloc buffer can be, and as memory gets
fragmented it becomes more difficult to get big buffers. The downside of
smaller buffers is that the driver has to split the transfer up which
hampers performance. Compression might also take a hit because of the
splitting.

Solve this by allocating the transfer buffer using vmalloc and creating an
SG table to be passed on to the USB subsystem. vmalloc_32() is used to
avoid DMA bounce buffers on USB controllers that can only access 32-bit
addresses.

This also solves the problem that split transfers can give host side
tearing since flushing is decoupled from rendering.

usb_sg_wait() doesn't have timeout handling built in, so it is wrapped in
a timer like 4 out of 6 users in the kernel have done.

v2:
- Use DIV_ROUND_UP (Linus)
- Add timeout note to the commit log (Linus)
- Expand note about upper buffer limit (Linus)
- Change var name s/timer/ctx/ in gud_usb_bulk_timeout()

Reviewed-by: Linus Walleij 
Signed-off-by: Noralf Trønnes 
---
 drivers/gpu/drm/gud/gud_drv.c  | 50 +-
 drivers/gpu/drm/gud/gud_internal.h |  2 ++
 drivers/gpu/drm/gud/gud_pipe.c | 47 
 3 files changed, 78 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index 45427c73587f..b39a54f17063 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -394,13 +394,40 @@ static const struct drm_driver gud_drm_driver = {
.minor  = 0,
 };
 
+static int gud_alloc_bulk_buffer(struct gud_device *gdrm)
+{
+   unsigned int i, num_pages;
+   struct page **pages;
+   void *ptr;
+   int ret;
+
+   gdrm->bulk_buf = vmalloc_32(gdrm->bulk_len);
+   if (!gdrm->bulk_buf)
+   return -ENOMEM;
+
+   num_pages = DIV_ROUND_UP(gdrm->bulk_len, PAGE_SIZE);
+   pages = kmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
+   if (!pages)
+   return -ENOMEM;
+
+   for (i = 0, ptr = gdrm->bulk_buf; i < num_pages; i++, ptr += PAGE_SIZE)
+   pages[i] = vmalloc_to_page(ptr);
+
+   ret = sg_alloc_table_from_pages(&gdrm->bulk_sgt, pages, num_pages,
+   0, gdrm->bulk_len, GFP_KERNEL);
+   kfree(pages);
+
+   return ret;
+}
+
 static void gud_free_buffers_and_mutex(void *data)
 {
struct gud_device *gdrm = data;
 
vfree(gdrm->compress_buf);
gdrm->compress_buf = NULL;
-   kfree(gdrm->bulk_buf);
+   sg_free_table(&gdrm->bulk_sgt);
+   vfree(gdrm->bulk_buf);
gdrm->bulk_buf = NULL;
mutex_destroy(&gdrm->ctrl_lock);
 }
@@ -537,24 +564,17 @@ static int gud_probe(struct usb_interface *intf, const 
struct usb_device_id *id)
 
if (desc.max_buffer_size)
max_buffer_size = le32_to_cpu(desc.max_buffer_size);
-retry:
-   /*
-* Use plain kmalloc here since devm_kmalloc() places struct devres at 
the beginning
-* of the buffer it allocates. This wastes a lot of memory when 
allocating big buffers.
-* Asking for 2M would actually allocate 4M. This would also prevent 
getting the biggest
-* possible buffer potentially leading to split transfers.
-*/
-   gdrm->bulk_buf = kmalloc(max_buffer_size, GFP_KERNEL | __GFP_NOWARN);
-   if (!gdrm->bulk_buf) {
-   max_buffer_size = roundup_pow_of_two(max_buffer_size) / 2;
-   if (max_buffer_size < SZ_512K)
-   return -ENOMEM;
-   goto retry;
-   }
+   /* Prevent a misbehaving device from allocating loads of RAM. 4096x4096@XRGB8888 = 64 MB */
+   if (max_buffer_size > SZ_64M)
+   max_buffer_size = SZ_64M;
 
gdrm->bulk_pipe = usb_sndbulkpipe(interface_to_usbdev(intf), 
usb_endpoint_num(bulk_out));
gdrm->bulk_len = max_buffer_size;
 
+   ret = gud_alloc_bulk_buffer(gdrm);
+   if (ret)
+   return ret;
+
if (gdrm->compression & GUD_COMPRESSION_LZ4) {
gdrm->lz4_comp_mem = devm_kmalloc(dev, LZ4_MEM_COMPRESS, 
GFP_KERNEL);
if (!gdrm->lz4_comp_mem)
diff --git a/drivers/gpu/drm/gud/gud_internal.h 
b/drivers/gpu/drm/gud/gud_internal.h
index de2f2d2dbc60..1bb65a46c347 100644
--- a/drivers/gpu/drm/gud/gud_internal.h
+++ b/drivers/gpu/drm/gud/gud_internal.h
@@ -5,6 +5,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -26,6 +27,7 @@ struct gud_device {
unsigned int bulk_pipe;
void *bulk_buf;
size_t bulk_len;
+   struct sg_table bulk_sgt;
 
u8 compression;
void *lz4_comp_mem;
diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index 2f83ab6b8e61..e0fb6cc969a3 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -220,13 +220,51 @@ static int gud_prep_flush(struct gud_device *gdrm, struct 
drm_framebuffer *fb,
return ret;
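
The buffer sizing in this patch comes down to two small calculations: clamping the device-reported size to 64 MB and rounding the length up to whole pages for the SG table. The following user-space sketch shows that arithmetic only. MOCK_PAGE_SIZE (assumed 4 KiB), MOCK_SZ_64M, clamp_bulk_len() and bulk_num_pages() are hypothetical names for illustration, and DIV_ROUND_UP is re-defined locally rather than taken from kernel headers.

```c
#include <stddef.h>

#define MOCK_PAGE_SIZE	((size_t)4096)		/* assumption: 4 KiB pages */
#define MOCK_SZ_64M	((size_t)64 * 1024 * 1024)

/* Local re-definition for illustration; rounds the quotient up. */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Cap what a (possibly misbehaving) device may ask the host to allocate. */
static size_t clamp_bulk_len(size_t requested)
{
	return requested > MOCK_SZ_64M ? MOCK_SZ_64M : requested;
}

/* Number of page pointers needed to describe a buffer of bulk_len bytes. */
static size_t bulk_num_pages(size_t bulk_len)
{
	return DIV_ROUND_UP(bulk_len, MOCK_PAGE_SIZE);
}
```

A 4096x4096 framebuffer at 4 bytes per pixel is exactly 64 MB, so it passes the clamp unchanged and needs 16384 page entries under the 4 KiB assumption; a length that is not page-aligned gets one extra, partially used page.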
 

[PATCH] drm/fourcc: Add modifier definitions for Arm Fixed Rate Compression

2021-07-01 Thread Normunds Rieksts
Arm Fixed Rate Compression (AFRC) is a proprietary fixed rate image
compression protocol and format.
It is designed to provide guaranteed bandwidth and memory footprint
reductions in graphics and media use-cases.

This patch aims to add modifier definitions for describing
AFRC.

Signed-off-by: Normunds Rieksts 
---
 include/uapi/drm/drm_fourcc.h | 109 +-
 1 file changed, 106 insertions(+), 3 deletions(-)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index f7156322aba5..9f4bb4a6f358 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -900,9 +900,9 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
 
 /*
  * The top 4 bits (out of the 56 bits alloted for specifying vendor specific
- * modifiers) denote the category for modifiers. Currently we have only two
- * categories of modifiers ie AFBC and MISC. We can have a maximum of sixteen
- * different categories.
+ * modifiers) denote the category for modifiers. Currently we have three
+ * categories of modifiers ie AFBC, MISC and AFRC. We can have a maximum of
+ * sixteen different categories.
  */
 #define DRM_FORMAT_MOD_ARM_CODE(__type, __val) \
fourcc_mod_code(ARM, ((__u64)(__type) << 52) | ((__val) & 0x000fffffffffffffULL))
@@ -1017,6 +1017,109 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 
modifier)
  */
 #define AFBC_FORMAT_MOD_USM(1ULL << 12)
 
+/*
+ * Arm Fixed-Rate Compression (AFRC) modifiers
+ *
+ * AFRC is a proprietary fixed rate image compression protocol and format,
+ * designed to provide guaranteed bandwidth and memory footprint
+ * reductions in graphics and media use-cases.
+ *
+ * AFRC buffers consist of one or more planes, with the same components
+ * and meaning as an uncompressed buffer using the same pixel format.
+ *
+ * Within each plane, the pixel/luma/chroma values are grouped into
+ * "coding unit" blocks which are individually compressed to a
+ * fixed size (in bytes). All coding units within a given plane of a buffer
+ * store the same number of values, and have the same compressed size.
+ *
+ * The coding unit size is configurable, allowing different rates of 
compression.
+ *
+ * The start of each AFRC buffer plane must be aligned to an alignment granule 
which
+ * depends on the coding unit size.
+ *
+ * Coding Unit Size   Plane Alignment
+ *    ---
+ * 16 bytes   1024 bytes
+ * 24 bytes   512  bytes
+ * 32 bytes   2048 bytes
+ *
+ * Coding units are grouped into paging tiles. AFRC buffer dimensions must be 
aligned
+ * to a multiple of the paging tile dimensions.
+ * The dimensions of each paging tile depend on whether the buffer is 
optimised for
+ * scanline (SCAN layout) or rotated (ROT layout) access.
+ *
+ * Layout   Paging Tile Width   Paging Tile Height
+ * --   -   --
+ * SCAN 16 coding units 4 coding units
+ * ROT  8  coding units 8 coding units
+ *
+ * The dimensions of each coding unit depend on the number of components
+ * in the compressed plane and whether the buffer is optimised for
+ * scanline (SCAN layout) or rotated (ROT layout) access.
+ *
+ * Number of Components in Plane   Layout      Coding Unit Width   Coding Unit Height
+ * -----------------------------   ---------   -----------------   ------------------
+ * 1                               SCAN        16 samples          4 samples
+ * Example: 16x4 luma samples in a 'Y' plane
+ *          16x4 chroma 'V' values, in the 'V' plane of a fully-planar YUV buffer
+ * -----------------------------   ---------   -----------------   ------------------
+ * 1                               ROT         8 samples           8 samples
+ * Example: 8x8 luma samples in a 'Y' plane
+ *          8x8 chroma 'V' values, in the 'V' plane of a fully-planar YUV buffer
+ * -----------------------------   ---------   -----------------   ------------------
+ * 2                               DONT CARE   8 samples           4 samples
+ * Example: 8x4 chroma pairs in the 'UV' plane of a semi-planar YUV buffer
+ * -----------------------------   ---------   -----------------   ------------------
+ * 3                               DONT CARE   4 samples           4 samples
+ * Example: 4x4 pixels in an RGB buffer without alpha
+ * -----------------------------   ---------   -----------------   ------------------
+ * 4                               DONT CARE   4 samples           4 samples
+ * Example: 4x4 pixels in an RGB buffer with alpha
+ */
+
+#define DRM_FORMAT_MOD_ARM_TYPE_AFRC 0x02
+
+#define DRM_FORMAT_MOD_ARM_AFRC(__afrc_mode) \
+   DRM_FORMAT_MOD_ARM_CODE(DRM_FORMAT_MOD_ARM_TYPE_AFRC, __afrc_mode)
+
+/*
+ * AFRC coding unit size modifier.
+ *
+ * Indicates the number of bytes used to store each compressed coding unit for
+ * one or more planes in an AFRC encoded buffer. The coding unit size for chrominance
+ * is the same for both Cb and 
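
The tables in the AFRC comment above can be turned into a small lookup. The following is only a sketch under stated assumptions: the names `afrc_layout`, `afrc_dims`, `afrc_coding_unit_samples`, `afrc_paging_tile_samples`, `afrc_plane_alignment_bytes` and `afrc_align_up` are hypothetical and do not come from the patch, and the sketch assumes that the paging tile size in samples is simply the coding unit size multiplied by the number of coding units per tile.

```c
#include <stdint.h>

/* Hypothetical names; only the numbers come from the AFRC comment tables. */
enum afrc_layout { AFRC_LAYOUT_SCAN, AFRC_LAYOUT_ROT };

struct afrc_dims {
	uint32_t width;
	uint32_t height;
};

/* Coding unit size in samples; {0, 0} for an unsupported component count. */
static struct afrc_dims afrc_coding_unit_samples(unsigned int components,
						 enum afrc_layout layout)
{
	struct afrc_dims cu = { 0, 0 };

	switch (components) {
	case 1:
		/* Only single-component planes depend on the layout. */
		if (layout == AFRC_LAYOUT_SCAN) {
			cu.width = 16;
			cu.height = 4;
		} else {
			cu.width = 8;
			cu.height = 8;
		}
		break;
	case 2:
		cu.width = 8;
		cu.height = 4;
		break;
	case 3:
	case 4:
		cu.width = 4;
		cu.height = 4;
		break;
	default:
		break;
	}
	return cu;
}

/* Paging tile size in samples: 16x4 coding units (SCAN) or 8x8 (ROT). */
static struct afrc_dims afrc_paging_tile_samples(unsigned int components,
						 enum afrc_layout layout)
{
	struct afrc_dims cu = afrc_coding_unit_samples(components, layout);
	struct afrc_dims tile;

	tile.width = cu.width * (layout == AFRC_LAYOUT_SCAN ? 16 : 8);
	tile.height = cu.height * (layout == AFRC_LAYOUT_SCAN ? 4 : 8);
	return tile;
}

/* Plane start alignment in bytes for a coding unit size; 0 if unknown. */
static uint32_t afrc_plane_alignment_bytes(uint32_t coding_unit_bytes)
{
	switch (coding_unit_bytes) {
	case 16: return 1024;
	case 24: return 512;
	case 32: return 2048;
	default: return 0;
	}
}

/* Round value up to the next multiple (multiple must be non-zero). */
static uint32_t afrc_align_up(uint32_t value, uint32_t multiple)
{
	return (value + multiple - 1) / multiple * multiple;
}
```

Under these assumptions a single-component SCAN plane has a paging tile of 256x16 samples, so a 1920x1080 luma plane would be padded to 2048x1088 before allocation.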

[PATCH 3/7] drm/i915/guc: Increase size of CTB buffers

2021-07-01 Thread Matthew Brost
With the introduction of non-blocking CTBs more than one CTB can be in
flight at a time. Increasing the size of the CTBs should reduce how
often software hits the case where no space is available in the CTB
buffer.

Cc: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 80db59b45c45..43e03aa2dde8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -58,11 +58,16 @@ static inline struct drm_device *ct_to_drm(struct 
intel_guc_ct *ct)
  *  ++---+--+
  *
  * Size of each `CT Buffer`_ must be multiple of 4K.
- * As we don't expect too many messages, for now use minimum sizes.
+ * We don't expect too many messages in flight at any time, unless we are
+ * using the GuC submission. In that case each request requires a minimum
+ * 2 dwords which gives us a maximum 256 queue'd requests. Hopefully this is
+ * enough space to avoid backpressure on the driver. We increase the size
+ * of the receive buffer (relative to the send) to ensure a G2H response
+ * CTB has a landing spot.
  */
 #define CTB_DESC_SIZE  ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
 #define CTB_H2G_BUFFER_SIZE    (SZ_4K)
-#define CTB_G2H_BUFFER_SIZE    (SZ_4K)
+#define CTB_G2H_BUFFER_SIZE    (4 * CTB_H2G_BUFFER_SIZE)
 
 struct ct_request {
struct list_head link;
@@ -643,7 +648,7 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
/* beware of buffer wrap case */
if (unlikely(available < 0))
available += size;
-   CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
+   CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
GEM_BUG_ON(available < 0);
 
header = cmds[head];
-- 
2.28.0
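
For reference, the buffer sizes that result from this patch can be worked out in a few lines of standalone C. This is a sketch only: `ALIGN_UP`, `GUC_CT_BUFFER_DESC_BYTES`, `ctb_blob_size` and `ctb_dwords` are hypothetical stand-ins, the 64-byte descriptor size is taken from the `static_assert` in the ABI header, and the blob size formula is the one visible in `intel_guc_ct_init()`.

```c
#define SZ_2K 0x800u
#define SZ_4K 0x1000u

/* Stand-in for the kernel's ALIGN(); alignment must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* sizeof(struct guc_ct_buffer_desc), per the static_assert in the ABI header. */
#define GUC_CT_BUFFER_DESC_BYTES 64u

#define CTB_DESC_SIZE       ALIGN_UP(GUC_CT_BUFFER_DESC_BYTES, SZ_2K)
#define CTB_H2G_BUFFER_SIZE (SZ_4K)
#define CTB_G2H_BUFFER_SIZE (4 * CTB_H2G_BUFFER_SIZE)

/* Total allocation: two descriptors plus the send and receive buffers. */
static unsigned int ctb_blob_size(void)
{
	return 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + CTB_G2H_BUFFER_SIZE;
}

/* CT messages are counted in 32-bit dwords. */
static unsigned int ctb_dwords(unsigned int bytes)
{
	return bytes / 4;
}
```

With these values the send buffer holds 1024 dwords, the receive buffer 4096 dwords, and the whole blob is 24 KiB.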



[PATCH 4/7] drm/i915/guc: Add non blocking CTB send function

2021-07-01 Thread Matthew Brost
Add non blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.

The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.

The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.

v2:
 (Michal)
  - Use define for H2G room calculations
  - Move INTEL_GUC_SEND_NB define
 (Daniel Vetter)
  - Use msleep_interruptible rather than cond_resched
v3:
 (Michal)
  - Move includes to following patch
  - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
---
 .../gt/uc/abi/guc_communication_ctb_abi.h |  3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 87 +--
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 +-
 4 files changed, 91 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index e933ca02d0eb..99e1fad5ca20 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
  *  
+---+---+--+
  */
 
-#define GUC_CTB_MSG_MIN_LEN    1u
+#define GUC_CTB_HDR_LEN        1u
+#define GUC_CTB_MSG_MIN_LEN    GUC_CTB_HDR_LEN
 #define GUC_CTB_MSG_MAX_LEN    256u
 #define GUC_CTB_MSG_0_FENCE    (0xffff << 16)
 #define GUC_CTB_MSG_0_FORMAT   (0xf << 12)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4abc59f6f3cd..72e4653222e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct 
intel_guc_log *log)
 static
 inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
 {
-   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0);
+   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0);
+}
+
+static
+inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
+{
+   return intel_guc_ct_send(&guc->ct, action, len, NULL, 0,
+INTEL_GUC_CT_SEND_NB);
 }
 
 static inline int
@@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 
*action, u32 len,
   u32 *response_buf, u32 response_buf_size)
 {
return intel_guc_ct_send(&guc->ct, action, len,
-response_buf, response_buf_size);
+response_buf, response_buf_size, 0);
 }
 
 static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43e03aa2dde8..fb825cc1d090 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -3,6 +3,8 @@
  * Copyright © 2016-2019 Intel Corporation
  */
 
+#include <linux/circ_buf.h>
+
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
 #include "gt/intel_gt.h"
@@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct)
 static int ct_write(struct intel_guc_ct *ct,
const u32 *action,
u32 len /* in dwords */,
-   u32 fence)
+   u32 fence, u32 flags)
 {
struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -409,7 +411,7 @@ static int ct_write(struct intel_guc_ct *ct,
used = tail - head;
 
/* make sure there is a space including extra dw for the fence */
-   if (unlikely(used + len + 1 >= size))
+   if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
return -ENOSPC;
 
/*
@@ -421,9 +423,13 @@ static int ct_write(struct intel_guc_ct *ct,
 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
 
-   hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
- FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
-GUC_HXG_REQUEST_MSG_0_DATA0, action[0]);
+   hxg = (flags & INTEL_GUC_CT_SEND_NB) ?
+   (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) |
+FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION |
+   GUC_HXG_EVENT_MSG_0_DATA0, action[0])) :
+   (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
+ 
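
The space check that `ct_write()` performs at this point in the series can be sketched in isolation. The helper names `ctb_used` and `ctb_check_room` are hypothetical; the arithmetic mirrors the context lines of the diff above, where `tail == head` means empty and the buffer is never filled completely.

```c
#include <errno.h>
#include <stdint.h>

#define GUC_CTB_HDR_LEN 1u

/* Dwords currently occupied; head is the consumer index, tail the producer. */
static uint32_t ctb_used(uint32_t head, uint32_t tail, uint32_t size)
{
	if (tail < head)
		return (size - head) + tail;	/* producer has wrapped */
	return tail - head;
}

/*
 * Returns 0 if a message of len dwords plus its header fits, or -ENOSPC.
 * The ">=" keeps one slot free so that tail == head always means empty.
 */
static int ctb_check_room(uint32_t head, uint32_t tail, uint32_t size,
			  uint32_t len)
{
	if (ctb_used(head, tail, size) + len + GUC_CTB_HDR_LEN >= size)
		return -ENOSPC;
	return 0;
}
```

For an empty 1024-dword buffer the largest payload accepted by this check is therefore 1022 dwords.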

[PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads

2021-07-01 Thread Matthew Brost
CTB writes are now in the path of command submission and should be
optimized for performance. Rather than reading CTB descriptor values
(e.g. head, tail) which could result in accesses across the PCIe bus,
store shadow local copies and only read/write the descriptor values when
absolutely necessary. Also store the current space in each channel
locally.

v2:
 (Michel)
  - Add additional sanity checks for head / tail pointers
  - Use GUC_CTB_HDR_LEN rather than magic 1

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
 2 files changed, 65 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index a9cb7b608520..5b8b4ff609e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct 
guc_ct_buffer_desc *desc)
 static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
 {
ctb->broken = false;
+   ctb->tail = 0;
+   ctb->head = 0;
+   ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
+
guc_ct_buffer_desc_init(ctb->desc);
 }
 
@@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
 {
struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
struct guc_ct_buffer_desc *desc = ctb->desc;
-   u32 head = desc->head;
-   u32 tail = desc->tail;
+   u32 tail = ctb->tail;
u32 size = ctb->size;
-   u32 used;
u32 header;
u32 hxg;
u32 *cmds = ctb->cmds;
@@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct,
if (unlikely(desc->status))
goto corrupted;
 
-   if (unlikely((tail | head) >= size)) {
+   GEM_BUG_ON(tail > size);
+
+#ifdef CONFIG_DRM_I915_DEBUG_GUC
+   if (unlikely(tail != READ_ONCE(desc->tail))) {
+   CT_ERROR(ct, "Tail was modified %u != %u\n",
+desc->tail, ctb->tail);
+   desc->status |= GUC_CTB_STATUS_MISMATCH;
+   goto corrupted;
+   }
+   if (unlikely((desc->tail | desc->head) >= size)) {
CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
-head, tail, size);
+desc->head, desc->tail, size);
desc->status |= GUC_CTB_STATUS_OVERFLOW;
goto corrupted;
}
-
-   /*
-* tail == head condition indicates empty. GuC FW does not support
-* using up the entire buffer to get tail == head meaning full.
-*/
-   if (tail < head)
-   used = (size - head) + tail;
-   else
-   used = tail - head;
-
-   /* make sure there is a space including extra dw for the fence */
-   if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
-   return -ENOSPC;
+#endif
 
/*
 * dw0: CT header (including fence)
@@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct,
write_barrier(ct);
 
/* now update descriptor */
+   ctb->tail = tail;
WRITE_ONCE(desc->tail, tail);
+   ctb->space -= len + GUC_CTB_HDR_LEN;
 
return 0;
 
@@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct,
  * @req:   pointer to pending request
  * @status:placeholder for status
  *
- * For each sent request, Guc shall send bac CT response message.
+ * For each sent request, GuC shall send back CT response message.
  * Our message handler will update status of tracked request once
  * response message with given fence is received. Wait here and
  * check for valid response status value.
@@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
return ret;
 }
 
-static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
+static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
 {
-   struct guc_ct_buffer_desc *desc = ctb->desc;
-   u32 head = READ_ONCE(desc->head);
+   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+   u32 head;
u32 space;
 
-   space = CIRC_SPACE(desc->tail, head, ctb->size);
+   if (ctb->space >= len_dw)
+   return true;
+
+   head = READ_ONCE(ctb->desc->head);
+   if (unlikely(head > ctb->size)) {
+   CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n",
+ctb->desc->head, ctb->desc->tail, ctb->size);
+   ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW;
+   ctb->broken = true;
+   return false;
+   }
+
+   space = CIRC_SPACE(ctb->tail, head, ctb->size);
+   ctb->space = space;
 
return space >= len_dw;
 }
 
 static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
 {
-   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
-
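
The shadow-copy idea from this patch can be illustrated outside the driver. The sketch below is an assumption-laden model, not the i915 code: `shadow_ctb`, `shadow_has_room` and `shadow_commit` are hypothetical names, the `reads` counter exists only to show how often the shared head is touched, and the `CIRC_CNT`/`CIRC_SPACE` macros are redefined locally with the same arithmetic as `linux/circ_buf.h` (the size must be a power of two).

```c
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as linux/circ_buf.h; size must be a power of two. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

struct shadow_ctb {
	uint32_t tail;	/* local copy of the producer index */
	uint32_t size;	/* buffer size in dwords */
	uint32_t space;	/* cached free space in dwords */
	uint32_t reads;	/* illustration only: reads of the shared head */
};

/* Only read the (expensive) shared head when the cached space is too small. */
static bool shadow_has_room(struct shadow_ctb *ctb,
			    const volatile uint32_t *shared_head,
			    uint32_t len_dw)
{
	uint32_t head;

	if (ctb->space >= len_dw)
		return true;

	head = *shared_head;
	ctb->reads++;
	ctb->space = CIRC_SPACE(ctb->tail, head, ctb->size);

	return ctb->space >= len_dw;
}

/* Account for a message of len_dw dwords having been written. */
static void shadow_commit(struct shadow_ctb *ctb, uint32_t len_dw)
{
	ctb->tail = (ctb->tail + len_dw) & (ctb->size - 1);
	ctb->space -= len_dw;
}
```

In this model the producer refreshes its view of the consumer position only when the cached estimate runs out, which is the access pattern the commit message describes.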

[PATCH 7/7] drm/i915/guc: Module load failure test for CT buffer creation

2021-07-01 Thread Matthew Brost
From: John Harrison 

Add several module load failure injection points to the CT buffer creation
code path.

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 5b8b4ff609e2..d2a55521ef25 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -175,6 +175,10 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 
type,
 {
int err;
 
+   err = i915_inject_probe_error(guc_to_gt(ct_to_guc(ct))->i915, -ENXIO);
+   if (unlikely(err))
+   return err;
+
err = guc_action_register_ct_buffer(ct_to_guc(ct), type,
desc_addr, buff_addr, size);
if (unlikely(err))
@@ -226,6 +230,10 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
u32 *cmds;
int err;
 
+   err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO);
+   if (err)
+   return err;
+
GEM_BUG_ON(ct->vma);
 
blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + 
CTB_G2H_BUFFER_SIZE;
-- 
2.28.0



[PATCH 2/7] drm/i915/guc: Improve error message for unsolicited CT response

2021-07-01 Thread Matthew Brost
Improve the error message when an unsolicited CT response is received by
printing the fence that couldn't be found, the last fence, and all requests
with a response outstanding.

Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index b86575b99537..80db59b45c45 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -732,12 +732,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, 
struct ct_incoming_msg *r
found = true;
break;
}
-   spin_unlock_irqrestore(&ct->requests.lock, flags);
-
if (!found) {
CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
-   return -ENOKEY;
+   CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
+ct->requests.last_fence);
+   list_for_each_entry(req, &ct->requests.pending, link)
+   CT_ERROR(ct, "request %u awaits response\n",
+req->fence);
+   err = -ENOKEY;
}
+   spin_unlock_irqrestore(&ct->requests.lock, flags);
 
if (unlikely(err))
return err;
-- 
2.28.0



[PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function

2021-07-01 Thread Matthew Brost
Implement a stall timer which fails H2G CTBs once a period of time
with no forward progress is reached to prevent deadlock.

v2:
 (Michal)
  - Improve error message in ct_deadlock()
  - Set broken when ct_deadlock() returns true
  - Return -EPIPE on ct_deadlock()
v3:
 (Michal)
  - Add ms to stall timer comment
 (Matthew)
  - Move broken check to intel_guc_ct_send()

Signed-off-by: John Harrison 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 ++
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index fb825cc1d090..a9cb7b608520 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -4,6 +4,9 @@
  */
 
 #include <linux/circ_buf.h>
+#include <linux/ktime.h>
+#include <linux/time64.h>
+#include <linux/timekeeping.h>
 
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
@@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
goto err_deregister;
 
ct->enabled = true;
+   ct->stall_time = KTIME_MAX;
 
return 0;
 
@@ -388,9 +392,6 @@ static int ct_write(struct intel_guc_ct *ct,
u32 *cmds = ctb->cmds;
unsigned int i;
 
-   if (unlikely(ctb->broken))
-   return -EPIPE;
-
if (unlikely(desc->status))
goto corrupted;
 
@@ -506,6 +507,25 @@ static int wait_for_ct_request_update(struct ct_request 
*req, u32 *status)
return err;
 }
 
+#define GUC_CTB_TIMEOUT_MS 1500
+static inline bool ct_deadlocked(struct intel_guc_ct *ct)
+{
+   long timeout = GUC_CTB_TIMEOUT_MS;
+   bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
+
+   if (unlikely(ret)) {
+   struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
+   struct guc_ct_buffer_desc *recv = ct->ctbs.send.desc;
+
+   CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
+ktime_ms_delta(ktime_get(), ct->stall_time),
+send->status, recv->status);
+   ct->ctbs.send.broken = true;
+   }
+
+   return ret;
+}
+
 static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
 {
struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -517,6 +537,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer 
*ctb, u32 len_dw)
return space >= len_dw;
 }
 
+static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
+{
+   struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+
+   lockdep_assert_held(&ct->ctbs.send.lock);
+
+   if (unlikely(!h2g_has_room(ctb, len_dw))) {
+   if (ct->stall_time == KTIME_MAX)
+   ct->stall_time = ktime_get();
+
+   if (unlikely(ct_deadlocked(ct)))
+   return -EPIPE;
+   else
+   return -EBUSY;
+   }
+
+   ct->stall_time = KTIME_MAX;
+   return 0;
+}
+
 static int ct_send_nb(struct intel_guc_ct *ct,
  const u32 *action,
  u32 len,
@@ -529,11 +569,9 @@ static int ct_send_nb(struct intel_guc_ct *ct,
 
spin_lock_irqsave(&ctb->lock, spin_flags);
 
-   ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
-   if (unlikely(!ret)) {
-   ret = -EBUSY;
+   ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN);
+   if (unlikely(ret))
goto out;
-   }
 
fence = ct_get_next_fence(ct);
ret = ct_write(ct, action, len, fence, flags);
@@ -576,8 +614,13 @@ static int ct_send(struct intel_guc_ct *ct,
 retry:
spin_lock_irqsave(&ctb->lock, flags);
if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+   if (ct->stall_time == KTIME_MAX)
+   ct->stall_time = ktime_get();
spin_unlock_irqrestore(&ctb->lock, flags);
 
+   if (unlikely(ct_deadlocked(ct)))
+   return -EPIPE;
+
if (msleep_interruptible(sleep_period_ms))
return -EINTR;
sleep_period_ms = sleep_period_ms << 1;
@@ -585,6 +628,8 @@ static int ct_send(struct intel_guc_ct *ct,
goto retry;
}
 
+   ct->stall_time = KTIME_MAX;
+
fence = ct_get_next_fence(ct);
request.fence = fence;
request.status = 0;
@@ -647,6 +692,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 
*action, u32 len,
return -ENODEV;
}
 
+   if (unlikely(ct->ctbs.send.broken))
+   return -EPIPE;
+
if (flags & INTEL_GUC_CT_SEND_NB)
return ct_send_nb(ct, action, len, flags);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 5bb8bef024c8..bee03794c1eb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
+++ 
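
The stall timer logic of `has_room_nb()` and `ct_deadlocked()` reduces to a small state machine. The following standalone sketch uses hypothetical names (`stall_tracker`, `check_room_nb`, `STALL_NONE`) and takes the current time as a parameter instead of calling `ktime_get()`, so the behaviour can be followed without a kernel; `STALL_NONE` stands in for `KTIME_MAX`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define GUC_CTB_TIMEOUT_MS 1500
#define STALL_NONE INT64_MAX	/* stands in for KTIME_MAX: not stalled */

struct stall_tracker {
	int64_t stall_start_ms;	/* when the current stall began */
	bool broken;		/* set once the channel is declared dead */
};

/*
 * Returns 0 when there is room, -EBUSY while a stall is still within the
 * timeout, and -EPIPE once no progress was made for longer than the timeout.
 */
static int check_room_nb(struct stall_tracker *t, bool has_room, int64_t now_ms)
{
	if (has_room) {
		t->stall_start_ms = STALL_NONE;	/* forward progress: reset */
		return 0;
	}

	if (t->stall_start_ms == STALL_NONE)
		t->stall_start_ms = now_ms;	/* first failure starts the timer */

	if (now_ms - t->stall_start_ms > GUC_CTB_TIMEOUT_MS) {
		t->broken = true;
		return -EPIPE;
	}

	return -EBUSY;
}
```

A caller that sees `-EBUSY` may retry later, while `-EPIPE` signals that the channel should be treated as broken, matching the return codes listed in the v2 notes.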

[PATCH 0/7] CT changes required for GuC submission

2021-07-01 Thread Matthew Brost
As part of enabling GuC submission discussed in [1], [2], and [3] we
need to optimize and update the CT code as this is now in the critical
path of submission. This series includes the patches to do that, which are
the first 7 patches from [3]. The patches should have addressed all the
feedback in [3] and should be ready to merge once CI returns and we get a
few more RBs.

v2: Fix checkpatch warning, address a couple of Michal's comments

Signed-off-by: Matthew Brost 

[1] https://patchwork.freedesktop.org/series/89844/
[2] https://patchwork.freedesktop.org/series/91417/
[3] https://patchwork.freedesktop.org/series/91840/

John Harrison (1):
  drm/i915/guc: Module load failure test for CT buffer creation

Matthew Brost (6):
  drm/i915/guc: Relax CTB response timeout
  drm/i915/guc: Improve error message for unsolicited CT response
  drm/i915/guc: Increase size of CTB buffers
  drm/i915/guc: Add non blocking CTB send function
  drm/i915/guc: Add stall timer to non blocking CTB send function
  drm/i915/guc: Optimize CTB writes and reads

 .../gt/uc/abi/guc_communication_ctb_abi.h |   3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  11 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 250 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  14 +-
 4 files changed, 232 insertions(+), 46 deletions(-)

-- 
2.28.0



[PATCH 1/7] drm/i915/guc: Relax CTB response timeout

2021-07-01 Thread Matthew Brost
In an upcoming patch we will allow more CTB requests to be sent in
parallel to the GuC for processing, so we shouldn't assume any more
that GuC will always reply within 10ms.

Use a bigger hardcoded value of 1s instead.

v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option
v3:
 (Daniel Vetter)
  - Use hardcoded value of 1s rather than config option
v4:
 (Michal)
  - Use defines for timeout values

Signed-off-by: Matthew Brost 
Cc: Michal Wajdeczko 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43409044528e..b86575b99537 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -474,14 +474,18 @@ static int wait_for_ct_request_update(struct ct_request 
*req, u32 *status)
/*
 * Fast commands should complete in less than 10us, so sample quickly
 * up to that length of time, then switch to a slower sleep-wait loop.
-* No GuC command should ever take longer than 10ms.
+* No GuC command should ever take longer than 10ms but many GuC
+* commands can be inflight at time, so use a 1s timeout on the slower
+* sleep-wait loop.
 */
+#define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS 10
+#define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS 1000
 #define done \
(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
 GUC_HXG_ORIGIN_GUC)
-   err = wait_for_us(done, 10);
+   err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS);
if (err)
-   err = wait_for(done, 10);
+   err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
 #undef done
 
if (unlikely(err))
-- 
2.28.0
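
The two-phase wait described in the comment (sample quickly for a short window, then fall back to a slower sleep-wait loop with a longer timeout) can be modelled as below. This is a deliberately simplified sketch: it counts polls instead of measuring time, and `fake_request`, `request_done` and `wait_two_phase` are hypothetical names, not the `wait_for_us()`/`wait_for()` macros used by the driver.

```c
#include <errno.h>
#include <stdbool.h>

/* A fake request that completes after a fixed number of polls. */
struct fake_request {
	unsigned int polls_until_done;
	unsigned int polls_seen;
};

static bool request_done(struct fake_request *req)
{
	return ++req->polls_seen > req->polls_until_done;
}

/*
 * Phase 1: poll up to fast_budget times without sleeping.
 * Phase 2: poll up to slow_budget more times; real code would sleep
 * between polls here. Returns 0 on completion, -ETIMEDOUT otherwise.
 */
static int wait_two_phase(struct fake_request *req, unsigned int fast_budget,
			  unsigned int slow_budget, bool *used_slow_path)
{
	unsigned int i;

	*used_slow_path = false;
	for (i = 0; i < fast_budget; i++)
		if (request_done(req))
			return 0;

	*used_slow_path = true;
	for (i = 0; i < slow_budget; i++)
		if (request_done(req))
			return 0;

	return -ETIMEDOUT;
}
```

Enlarging only the slow budget, as the patch does with the 1s value, leaves the fast path untouched for commands that complete quickly.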



[PATCH v7 5/5] drm: protect drm_master pointers in drm_lease.c

2021-07-01 Thread Desmond Cheong Zhi Xi
drm_file->master pointers should be protected by
drm_device.master_mutex or drm_file.master_lock when being
dereferenced.

However, in drm_lease.c, there are multiple instances where
drm_file->master is accessed and dereferenced while neither lock is
held. This makes drm_lease.c vulnerable to use-after-free bugs.

We address this issue in 2 ways:

1. Add a new drm_file_get_master() function that calls drm_master_get
on drm_file->master while holding on to drm_file.master_lock. Since
drm_master_get increments the reference count of master, this
prevents master from being freed until we unreference it with
drm_master_put.

2. In each case where drm_file->master is directly accessed and
eventually dereferenced in drm_lease.c, we wrap the access in a call
to the new drm_file_get_master function, then unreference the master
pointer once we are done using it.

Reported-by: Daniel Vetter 
Signed-off-by: Desmond Cheong Zhi Xi 
Reviewed-by: Emil Velikov 
---
 drivers/gpu/drm/drm_auth.c  | 25 
 drivers/gpu/drm/drm_lease.c | 81 -
 include/drm/drm_auth.h  |  1 +
 include/drm/drm_file.h  |  6 +++
 4 files changed, 93 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index fe5b6adc6133..17440ee54f30 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -390,6 +390,31 @@ struct drm_master *drm_master_get(struct drm_master 
*master)
 }
 EXPORT_SYMBOL(drm_master_get);
 
+/**
+ * drm_file_get_master - reference &drm_file.master of @file_priv
+ * @file_priv: DRM file private
+ *
+ * Increments the reference count of @file_priv's &drm_file.master and returns
+ * the &drm_file.master. If @file_priv has no &drm_file.master, returns NULL.
+ *
+ * Master pointers returned from this function should be unreferenced using
+ * drm_master_put().
+ */
+struct drm_master *drm_file_get_master(struct drm_file *file_priv)
+{
+   struct drm_master *master = NULL;
+
+   mutex_lock(&file_priv->master_lock);
+   if (!file_priv->master)
+   goto unlock;
+   master = drm_master_get(file_priv->master);
+
+unlock:
+   mutex_unlock(&file_priv->master_lock);
+   return master;
+}
+EXPORT_SYMBOL(drm_file_get_master);
+
 static void drm_master_destroy(struct kref *kref)
 {
struct drm_master *master = container_of(kref, struct drm_master, 
refcount);
diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
index 00fb433bcef1..92eac73d9001 100644
--- a/drivers/gpu/drm/drm_lease.c
+++ b/drivers/gpu/drm/drm_lease.c
@@ -106,10 +106,19 @@ static bool _drm_has_leased(struct drm_master *master, 
int id)
  */
 bool _drm_lease_held(struct drm_file *file_priv, int id)
 {
-   if (!file_priv || !file_priv->master)
+   bool ret;
+   struct drm_master *master;
+
+   if (!file_priv)
return true;
 
-   return _drm_lease_held_master(file_priv->master, id);
+   master = drm_file_get_master(file_priv);
+   if (!master)
+   return true;
+   ret = _drm_lease_held_master(master, id);
+   drm_master_put(&master);
+
+   return ret;
 }
 
 /**
@@ -128,13 +137,22 @@ bool drm_lease_held(struct drm_file *file_priv, int id)
struct drm_master *master;
bool ret;
 
-   if (!file_priv || !file_priv->master || !file_priv->master->lessor)
+   if (!file_priv)
return true;
 
-   master = file_priv->master;
+   master = drm_file_get_master(file_priv);
+   if (!master)
+   return true;
+   if (!master->lessor) {
+   ret = true;
+   goto out;
+   }
mutex_lock(&master->dev->mode_config.idr_mutex);
ret = _drm_lease_held_master(master, id);
mutex_unlock(&master->dev->mode_config.idr_mutex);
+
+out:
+   drm_master_put(&master);
return ret;
 }
 
@@ -154,10 +172,16 @@ uint32_t drm_lease_filter_crtcs(struct drm_file 
*file_priv, uint32_t crtcs_in)
int count_in, count_out;
uint32_t crtcs_out = 0;
 
-   if (!file_priv || !file_priv->master || !file_priv->master->lessor)
+   if (!file_priv)
return crtcs_in;
 
-   master = file_priv->master;
+   master = drm_file_get_master(file_priv);
+   if (!master)
+   return crtcs_in;
+   if (!master->lessor) {
+   crtcs_out = crtcs_in;
+   goto out;
+   }
dev = master->dev;
 
count_in = count_out = 0;
@@ -176,6 +200,9 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, 
uint32_t crtcs_in)
count_in++;
}
mutex_unlock(&master->dev->mode_config.idr_mutex);
+
+out:
+   drm_master_put(&master);
return crtcs_out;
 }
 
@@ -489,7 +516,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
size_t object_count;
int ret = 0;
struct idr leases;
-   struct drm_master *lessor = lessor_priv->master;
+   struct drm_master *lessor;
struct drm_master *lessee = 
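
The get-under-lock pattern that `drm_file_get_master()` introduces can be shown with a userspace analogue. This is a sketch under assumptions, not the DRM code: `struct master`, `struct file`, `master_get`, `master_put` and `file_get_master` are hypothetical names, a pthread mutex stands in for `drm_file.master_lock`, and the object is not freed when the count reaches zero.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

struct master {
	atomic_int refcount;
};

struct file {
	pthread_mutex_t master_lock;	/* serializes the master pointer */
	struct master *master;
};

static struct master *master_get(struct master *m)
{
	atomic_fetch_add(&m->refcount, 1);
	return m;
}

/* Drops one reference and clears the caller's pointer (freeing omitted). */
static void master_put(struct master **m)
{
	if (*m) {
		atomic_fetch_sub(&(*m)->refcount, 1);
		*m = NULL;
	}
}

/*
 * Take a reference while holding the lock, so the pointer cannot be
 * replaced and released between reading it and pinning it.
 */
static struct master *file_get_master(struct file *f)
{
	struct master *m = NULL;

	pthread_mutex_lock(&f->master_lock);
	if (f->master)
		m = master_get(f->master);
	pthread_mutex_unlock(&f->master_lock);

	return m;
}
```

Callers then use the returned pointer freely and drop it with `master_put()` when done, which is the shape of the changes to `_drm_lease_held()` and `drm_lease_held()` in the diff above.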

[PATCH v7 4/5] drm: serialize drm_file.master with a master lock

2021-07-01 Thread Desmond Cheong Zhi Xi
Currently, drm_file.master pointers should be protected by
drm_device.master_mutex when being dereferenced. This is because
drm_file.master is not invariant for the lifetime of drm_file. If
drm_file is not the creator of master, then drm_file.is_master is
false, and a call to drm_setmaster_ioctl will invoke
drm_new_set_master, which then allocates a new master for drm_file and
puts the old master.

Thus, without holding drm_device.master_mutex, the old value of
drm_file.master could be freed while it is being used by another
concurrent process.

However, it is not always possible to lock drm_device.master_mutex to
dereference drm_file.master. Through the fbdev emulation code, this
might occur in a deep nest of other locks. But drm_device.master_mutex
is also the outermost lock in the nesting hierarchy, so this leads to
potential deadlocks.

To address this, we introduce a new mutex at the bottom of the lock
hierarchy that only serializes drm_file.master. With this change, the
value of drm_file.master changes only when both
drm_device.master_mutex and drm_file.master_lock are held. Hence, any
process holding either of those locks can ensure that the value of
drm_file.master will not change concurrently.

Since no lock depends on the new drm_file.master_lock, when
drm_file.master is dereferenced, but drm_device.master_mutex cannot be
held, we can safely protect the master pointer with
drm_file.master_lock.

Reported-by: Daniel Vetter 
Signed-off-by: Desmond Cheong Zhi Xi 
---

Since our lock inversions were a result of dev->master_mutex being
used to serialize many other things, perhaps a finer grained lock will
solve the lockdep issues.

 drivers/gpu/drm/drm_auth.c | 10 --
 drivers/gpu/drm/drm_file.c |  1 +
 include/drm/drm_file.h | 12 +---
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index ab1863c5a5a0..fe5b6adc6133 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -169,11 +169,14 @@ static int drm_new_set_master(struct drm_device *dev, 
struct drm_file *fpriv)
 
WARN_ON(fpriv->is_master);
old_master = fpriv->master;
+   mutex_lock(&fpriv->master_lock);
fpriv->master = drm_master_create(dev);
if (!fpriv->master) {
fpriv->master = old_master;
+   mutex_unlock(&fpriv->master_lock);
return -ENOMEM;
}
+   mutex_unlock(&fpriv->master_lock);
 
fpriv->is_master = 1;
fpriv->authenticated = 1;
@@ -332,10 +335,13 @@ int drm_master_open(struct drm_file *file_priv)
 * any master object for render clients
 */
mutex_lock(&dev->master_mutex);
-   if (!dev->master)
+   if (!dev->master) {
ret = drm_new_set_master(dev, file_priv);
-   else
+   } else {
+   mutex_lock(&file_priv->master_lock);
file_priv->master = drm_master_get(dev->master);
+   mutex_unlock(&file_priv->master_lock);
+   }
mutex_unlock(&dev->master_mutex);
 
return ret;
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index d4f0bac6f8f8..8ccadfa1c752 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -176,6 +176,7 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
init_waitqueue_head(&file->event_wait);
file->event_space = 4096; /* set aside 4k for event buffer */
 
+   mutex_init(&file->master_lock);
mutex_init(&file->event_read_lock);
 
if (drm_core_check_feature(dev, DRIVER_GEM))
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index b81b3bfb08c8..88539f93fc8e 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -226,15 +226,21 @@ struct drm_file {
/**
 * @master:
 *
-* Master this node is currently associated with. Only relevant if
-* drm_is_primary_client() returns true. Note that this only
-* matches &drm_device.master if the master is the currently active one.
+* Master this node is currently associated with. Protected by struct
+* &drm_device.master_mutex, and serialized by @master_lock.
+*
+* Only relevant if drm_is_primary_client() returns true. Note that
+* this only matches &drm_device.master if the master is the currently
+* active one.
 *
 * See also @authentication and @is_master and the :ref:`section on
 * primary nodes and authentication <drm_primary_node>`.
 */
struct drm_master *master;
 
+   /** @master_lock: Serializes @master. */
+   struct mutex master_lock;
+
/** @pid: Process that opened this file. */
struct pid *pid;
 
-- 
2.25.1



[PATCH v7 3/5] drm: add a locked version of drm_is_current_master

2021-07-01 Thread Desmond Cheong Zhi Xi
While checking the master status of the DRM file in
drm_is_current_master(), the device's master mutex should be
held. Without the mutex, the pointer fpriv->master may be freed
concurrently by another process calling drm_setmaster_ioctl(). This
could lead to use-after-free errors when the pointer is subsequently
dereferenced in drm_lease_owner().

The callers of drm_is_current_master() from drm_auth.c hold the
device's master mutex, but external callers do not. Hence, we implement
drm_is_current_master_locked() to be used within drm_auth.c, and
modify drm_is_current_master() to grab the device's master mutex
before checking the master status.

Reported-by: Daniel Vetter 
Signed-off-by: Desmond Cheong Zhi Xi 
Reviewed-by: Emil Velikov 
---
 drivers/gpu/drm/drm_auth.c | 51 --
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index f00e5abdbbf4..ab1863c5a5a0 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -61,6 +61,35 @@
  * trusted clients.
  */
 
+static bool drm_is_current_master_locked(struct drm_file *fpriv)
+{
+   lockdep_assert_held_once(&fpriv->minor->dev->master_mutex);
+
+   return fpriv->is_master && drm_lease_owner(fpriv->master) == 
fpriv->minor->dev->master;
+}
+
+/**
+ * drm_is_current_master - checks whether @priv is the current master
+ * @fpriv: DRM file private
+ *
+ * Checks whether @fpriv is current master on its device. This decides whether a
+ * client is allowed to run DRM_MASTER IOCTLs.
+ *
+ * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
+ * - the current master is assumed to own the non-shareable display hardware.
+ */
+bool drm_is_current_master(struct drm_file *fpriv)
+{
+   bool ret;
+
+   mutex_lock(&fpriv->minor->dev->master_mutex);
+   ret = drm_is_current_master_locked(fpriv);
+   mutex_unlock(&fpriv->minor->dev->master_mutex);
+
+   return ret;
+}
+EXPORT_SYMBOL(drm_is_current_master);
+
 int drm_getmagic(struct drm_device *dev, void *data, struct drm_file *file_priv)
 {
struct drm_auth *auth = data;
@@ -223,7 +252,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void *data,
if (ret)
goto out_unlock;
 
-   if (drm_is_current_master(file_priv))
+   if (drm_is_current_master_locked(file_priv))
goto out_unlock;
 
if (dev->master) {
@@ -272,7 +301,7 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void *data,
if (ret)
goto out_unlock;
 
-   if (!drm_is_current_master(file_priv)) {
+   if (!drm_is_current_master_locked(file_priv)) {
ret = -EINVAL;
goto out_unlock;
}
@@ -321,7 +350,7 @@ void drm_master_release(struct drm_file *file_priv)
if (file_priv->magic)
idr_remove(&file_priv->master->magic_map, file_priv->magic);
 
-   if (!drm_is_current_master(file_priv))
+   if (!drm_is_current_master_locked(file_priv))
goto out;
 
drm_legacy_lock_master_cleanup(dev, master);
@@ -342,22 +371,6 @@ void drm_master_release(struct drm_file *file_priv)
mutex_unlock(&dev->master_mutex);
 }
 
-/**
- * drm_is_current_master - checks whether @priv is the current master
- * @fpriv: DRM file private
- *
- * Checks whether @fpriv is current master on its device. This decides whether a
- * client is allowed to run DRM_MASTER IOCTLs.
- *
- * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
- * - the current master is assumed to own the non-shareable display hardware.
- */
-bool drm_is_current_master(struct drm_file *fpriv)
-{
-   return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
-}
-EXPORT_SYMBOL(drm_is_current_master);
-
 /**
  * drm_master_get - reference a master pointer
  * @master:  drm_master
-- 
2.25.1



[PATCH v7 2/5] drm: separate locks in __drm_mode_object_find

2021-07-01 Thread Desmond Cheong Zhi Xi
In a future patch, _drm_lease_held will dereference drm_file->master
only after making a call to drm_file_get_master. This will increment
the reference count of drm_file->master while holding onto a new
drm_file.master_lock.

In preparation for this, the call to _drm_lease_held should be moved
out from the section locked by &dev->mode_config.idr_mutex. This
avoids creating new lock hierarchies between
&dev->mode_config.idr_mutex and &drm_file->master_lock.
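
A minimal userspace sketch of the ordering described above, under stated
assumptions: the lookup takes its reference while holding the first lock,
drops that lock, and only then runs the check that needs the second lock,
giving the reference back if the check fails. The demo_* names, the
fixed-size table and the pthread mutexes are hypothetical and only model
the idea; they are not the DRM implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_obj {
	int id;
	int refcount;
};

/* Hypothetical stand-ins for the idr mutex and the master lock. */
static pthread_mutex_t demo_idr_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_master_lock = PTHREAD_MUTEX_INITIALIZER;

static struct demo_obj demo_table[2] = { { 1, 1 }, { 2, 1 } };
static int demo_leased_id = 1;	/* guarded by demo_master_lock */

/* The lease check takes its own lock and is never called under the idr lock. */
static bool demo_lease_held(int id)
{
	bool held;

	pthread_mutex_lock(&demo_master_lock);
	held = (id == demo_leased_id);
	pthread_mutex_unlock(&demo_master_lock);

	return held;
}

static void demo_obj_put(struct demo_obj *obj)
{
	pthread_mutex_lock(&demo_idr_mutex);
	assert(obj->refcount > 0);
	obj->refcount--;
	pthread_mutex_unlock(&demo_idr_mutex);
}

static struct demo_obj *demo_obj_find(int id)
{
	struct demo_obj *obj = NULL;
	size_t i;

	/* Step 1: look up and take a reference under the idr lock only. */
	pthread_mutex_lock(&demo_idr_mutex);
	for (i = 0; i < 2; i++) {
		if (demo_table[i].id == id) {
			obj = &demo_table[i];
			obj->refcount++;
			break;
		}
	}
	pthread_mutex_unlock(&demo_idr_mutex);

	/* Step 2: lease check with the idr lock released; undo on failure. */
	if (obj && !demo_lease_held(obj->id)) {
		demo_obj_put(obj);
		obj = NULL;
	}

	return obj;
}
```

Because the two locks are never held at the same time in this sketch, no
ordering between them is established.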

Reported-by: Daniel Vetter 
Signed-off-by: Desmond Cheong Zhi Xi 
---
 drivers/gpu/drm/drm_mode_object.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_mode_object.c b/drivers/gpu/drm/drm_mode_object.c
index b26588b52795..83e35ff3b13a 100644
--- a/drivers/gpu/drm/drm_mode_object.c
+++ b/drivers/gpu/drm/drm_mode_object.c
@@ -146,16 +146,18 @@ struct drm_mode_object *__drm_mode_object_find(struct drm_device *dev,
if (obj && obj->id != id)
obj = NULL;
 
-   if (obj && drm_mode_object_lease_required(obj->type) &&
-   !_drm_lease_held(file_priv, obj->id))
-   obj = NULL;
-
if (obj && obj->free_cb) {
if (!kref_get_unless_zero(&obj->refcount))
obj = NULL;
}
mutex_unlock(&dev->mode_config.idr_mutex);
 
+   if (obj && drm_mode_object_lease_required(obj->type) &&
+   !_drm_lease_held(file_priv, obj->id)) {
+   drm_mode_object_put(obj);
+   obj = NULL;
+   }
+
return obj;
 }
 
-- 
2.25.1



[PATCH v7 1/5] drm: avoid circular locks in drm_mode_getconnector

2021-07-01 Thread Desmond Cheong Zhi Xi
In preparation for a future patch to take a lock on
drm_device.master_mutex inside drm_is_current_master(), we first move
the call to drm_is_current_master() in drm_mode_getconnector out from the
section locked by &dev->mode_config.mutex. This avoids creating a
circular lock dependency.

Failing to avoid this lock dependency produces the following lockdep
splat:

======================================================
WARNING: possible circular locking dependency detected
5.13.0-rc7-CI-CI_DRM_10254+ #1 Not tainted
------------------------------------------------------
kms_frontbuffer/1087 is trying to acquire lock:
88810dcd01a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1b/0x40
but task is already holding lock:
88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (&dev->mode_config.mutex){+.+.}-{3:3}:
   __mutex_lock+0xab/0x970
   drm_client_modeset_probe+0x22e/0xca0
   __drm_fb_helper_initial_config_and_unlock+0x42/0x540
   intel_fbdev_initial_config+0xf/0x20 [i915]
   async_run_entry_fn+0x28/0x130
   process_one_work+0x26d/0x5c0
   worker_thread+0x37/0x380
   kthread+0x144/0x170
   ret_from_fork+0x1f/0x30
-> #1 (&client->modeset_mutex){+.+.}-{3:3}:
   __mutex_lock+0xab/0x970
   drm_client_modeset_commit_locked+0x1c/0x180
   drm_client_modeset_commit+0x1c/0x40
   __drm_fb_helper_restore_fbdev_mode_unlocked+0x88/0xb0
   drm_fb_helper_set_par+0x34/0x40
   intel_fbdev_set_par+0x11/0x40 [i915]
   fbcon_init+0x270/0x4f0
   visual_init+0xc6/0x130
   do_bind_con_driver+0x1e5/0x2d0
   do_take_over_console+0x10e/0x180
   do_fbcon_takeover+0x53/0xb0
   register_framebuffer+0x22d/0x310
   __drm_fb_helper_initial_config_and_unlock+0x36c/0x540
   intel_fbdev_initial_config+0xf/0x20 [i915]
   async_run_entry_fn+0x28/0x130
   process_one_work+0x26d/0x5c0
   worker_thread+0x37/0x380
   kthread+0x144/0x170
   ret_from_fork+0x1f/0x30
-> #0 (&dev->master_mutex){+.+.}-{3:3}:
   __lock_acquire+0x151e/0x2590
   lock_acquire+0xd1/0x3d0
   __mutex_lock+0xab/0x970
   drm_is_current_master+0x1b/0x40
   drm_mode_getconnector+0x37e/0x4a0
   drm_ioctl_kernel+0xa8/0xf0
   drm_ioctl+0x1e8/0x390
   __x64_sys_ioctl+0x6a/0xa0
   do_syscall_64+0x39/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: &dev->master_mutex --> &client->modeset_mutex --> &dev->mode_config.mutex
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&dev->mode_config.mutex);
                               lock(&client->modeset_mutex);
                               lock(&dev->mode_config.mutex);
  lock(&dev->master_mutex);
*** DEADLOCK ***
1 lock held by kms_frontbuffer/1087:
 #0: 88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0
stack backtrace:
CPU: 7 PID: 1087 Comm: kms_frontbuffer Not tainted 5.13.0-rc7-CI-CI_DRM_10254+ #1
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
Call Trace:
 dump_stack+0x7f/0xad
 check_noncircular+0x12e/0x150
 __lock_acquire+0x151e/0x2590
 lock_acquire+0xd1/0x3d0
 __mutex_lock+0xab/0x970
 drm_is_current_master+0x1b/0x40
 drm_mode_getconnector+0x37e/0x4a0
 drm_ioctl_kernel+0xa8/0xf0
 drm_ioctl+0x1e8/0x390
 __x64_sys_ioctl+0x6a/0xa0
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae
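
The reordering that avoids the splat above can be modelled in userspace
roughly as follows. This is only a sketch under stated assumptions: the
demo_* names and pthread mutexes are hypothetical stand-ins for the master
mutex and the mode_config mutex. The master status is sampled before the
second lock is taken, so the two locks are never nested.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for master_mutex and mode_config.mutex. */
static pthread_mutex_t demo_master_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_config_mutex = PTHREAD_MUTEX_INITIALIZER;

static bool demo_current_master = true;	/* guarded by demo_master_mutex */
static int demo_probe_count;		/* guarded by demo_config_mutex */

static bool demo_is_current_master(void)
{
	bool ret;

	pthread_mutex_lock(&demo_master_mutex);
	ret = demo_current_master;
	pthread_mutex_unlock(&demo_master_mutex);

	return ret;
}

/* Returns how many times the (simulated) mode probe has run so far. */
static int demo_getconnector(int count_modes)
{
	bool is_current_master;
	int probes;

	assert(count_modes >= 0);

	/* Sample the master status first, while holding no other lock. */
	is_current_master = demo_is_current_master();

	pthread_mutex_lock(&demo_config_mutex);
	if (count_modes == 0 && is_current_master)
		demo_probe_count++;	/* stands in for the fill_modes call */
	probes = demo_probe_count;
	pthread_mutex_unlock(&demo_config_mutex);

	return probes;
}
```

One consequence of this shape is that the sampled value can be stale by the
time it is used under the second lock; the sketch, like the reordering it
models, accepts that in exchange for not nesting the locks.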

Reported-by: Daniel Vetter 
Signed-off-by: Desmond Cheong Zhi Xi 
Reviewed-by: Emil Velikov 
---
 drivers/gpu/drm/drm_connector.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index da39e7ff6965..2ba257b1ae20 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2414,6 +2414,7 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
struct drm_mode_modeinfo u_mode;
struct drm_mode_modeinfo __user *mode_ptr;
uint32_t __user *encoder_ptr;
+   bool is_current_master;
 
if (!drm_core_check_feature(dev, DRIVER_MODESET))
return -EOPNOTSUPP;
@@ -2444,9 +2445,11 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
out_resp->connector_type = connector->connector_type;
out_resp->connector_type_id = connector->connector_type_id;
 
+   is_current_master = drm_is_current_master(file_priv);
+
mutex_lock(&dev->mode_config.mutex);
if (out_resp->count_modes == 0) {
-   if (drm_is_current_master(file_priv))
+   if (is_current_master)
connector->funcs->fill_modes(connector,
 dev->mode_config.max_width,
   
