Re: [Intel-gfx] [PATCH v1 2/2] drm/i915/gem: Migrate to system at dma-buf attach time
Hi Michael,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v5.13 next-20210701]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Michael-J-Ruhl/drm-i915-gem-Correct-the-locking-and-pin-pattern-for-dma-buf/20210702-042115
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-r025-20210630 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9eb613b2de3163686b1a4bd1160f15ac56a4b083)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/d1c1ca8d45e76fc2b9ee679c170848e6c6138f6e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Michael-J-Ruhl/drm-i915-gem-Correct-the-locking-and-pin-pattern-for-dma-buf/20210702-042115
        git checkout d1c1ca8d45e76fc2b9ee679c170848e6c6138f6e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:177:7: error: implicit declaration of function 'i915_gem_object_can_migrate' [-Werror,-Wimplicit-function-declaration]
           if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM))
                ^
   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:177:7: note: did you mean 'i915_gem_object_pin_map'?
   drivers/gpu/drm/i915/gem/i915_gem_object.h:452:20: note: 'i915_gem_object_pin_map' declared here
   void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
                      ^
>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:179:8: error: implicit declaration of function 'i915_gem_object_migrate' [-Werror,-Wimplicit-function-declaration]
           ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
                 ^
   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:179:8: note: did you mean 'i915_gem_object_can_migrate'?
   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:177:7: note: 'i915_gem_object_can_migrate' declared here
           if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM))
                ^
>> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:181:9: error: implicit declaration of function 'i915_gem_object_wait_migration' [-Werror,-Wimplicit-function-declaration]
                   ret = i915_gem_object_wait_migration(obj, 0);
                         ^
   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:181:9: note: did you mean 'i915_gem_object_can_migrate'?
   drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c:177:7: note: 'i915_gem_object_can_migrate' declared here
           if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM))
                ^
   3 errors generated.

vim +/i915_gem_object_can_migrate +177 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c

   162
   163  /**
   164   * i915_gem_dmabuf_attach - Do any extra attach work necessary
   165   * @dmabuf: imported dma-buf
   166   * @attach: new attach to do work on
   167   *
   168   */
   169  static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf,
   170                                    struct dma_buf_attachment *attach)
   171  {
   172          struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf);
   173          int ret;
   174
   175          assert_object_held(obj);
   176
 > 177          if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM))
   178                  return -EOPNOTSUPP;
 > 179          ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM);
   180          if (!ret)
 > 181                  ret = i915_gem_object_wait_migration(obj, 0);
   182          if (!ret)
   183                  ret = i915_gem_object_pin_pages(obj);
   184
   185          return ret;
   186  }
   187

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH] drm/dbi: Print errors for mipi_dbi_command()
Hi Linus,

On Fri, Jul 02, 2021 at 12:25:18AM +0200, Linus Walleij wrote:
> The macro mipi_dbi_command() does not report errors unless you wrap it
> in another macro to do the error reporting.
>
> Report a rate-limited error so we know what is going on.
>
> Drop the only user in DRM using mipi_dbi_command() and actually checking
> the error explicitly, let it use mipi_dbi_command_buf() directly
> instead.
>
> After this any code wishing to send command arrays can rely on
> mipi_dbi_command() providing an appropriate error message if something
> goes wrong.
>
> Suggested-by: Noralf Trønnes
> Suggested-by: Douglas Anderson
> Signed-off-by: Linus Walleij
> ---
>  drivers/gpu/drm/drm_mipi_dbi.c | 2 +-
>  include/drm/drm_mipi_dbi.h     | 5 -
>  2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
> index 3854fb9798e9..c7c1b75df190 100644
> --- a/drivers/gpu/drm/drm_mipi_dbi.c
> +++ b/drivers/gpu/drm/drm_mipi_dbi.c
> @@ -645,7 +645,7 @@ static int mipi_dbi_poweron_reset_conditional(struct mipi_dbi_dev *dbidev, bool
>                  return 1;
>
>          mipi_dbi_hw_reset(dbi);
> -        ret = mipi_dbi_command(dbi, MIPI_DCS_SOFT_RESET);
> +        ret = mipi_dbi_command_buf(dbi, MIPI_DCS_SOFT_RESET, NULL, 0);
>          if (ret) {
>                  DRM_DEV_ERROR(dev, "Failed to send reset command (%d)\n", ret);
>                  if (dbidev->regulator)

I do not see the value in this change. There are many other
mipi_dbi_command() users, and the error return continues to be checked?

> diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h
> index f543d6e3e822..2057ad32760c 100644
> --- a/include/drm/drm_mipi_dbi.h
> +++ b/include/drm/drm_mipi_dbi.h
> @@ -183,7 +183,10 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb,
>  #define mipi_dbi_command(dbi, cmd, seq...) \
>  ({ \
>          const u8 d[] = { seq }; \
> -        mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> +        int ret; \
> +        ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \
> +        if (ret) \
> +                pr_err_ratelimited("MIPI DBI: error %d when sending command\n", ret); \
>  })

Could this be more informative if the SPI device were printed? It is
available. Maybe in 99% of the cases there is only one user anyway, so
it will not help?

        Sam
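For readers following the point about callers that still check the return value: a GNU statement expression evaluates to its last expression statement, so a macro body that ends on an `if` yields no value at all. The following is only a userspace sketch of that behaviour, assuming GCC or Clang; `demo_send_stackbuf()` and `demo_command()` are hypothetical stand-ins, not the DRM helpers, and the sketch uses `__VA_ARGS__` rather than the named `seq...` form of the quoted macro.

```c
#include <stddef.h>
#include <stdio.h>

typedef unsigned char u8;

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the real transfer function: fails for cmd 0xff. */
static int demo_send_stackbuf(u8 cmd, const u8 *data, size_t len)
{
	(void)data;
	(void)len;
	return cmd == 0xff ? -5 : 0;
}

/*
 * GNU statement expression: the value of the whole ({ ... }) is the value
 * of its last expression statement. Ending on the "if" would leave the
 * macro without a value, so "ret_;" comes last to hand the code back.
 */
#define demo_command(cmd, ...) \
({ \
	const u8 d_[] = { __VA_ARGS__ }; \
	int ret_ = demo_send_stackbuf(cmd, d_, ARRAY_SIZE(d_)); \
	if (ret_) \
		fprintf(stderr, "demo: error %d when sending command\n", ret_); \
	ret_; \
})
```

With the trailing `ret_;` in place, a caller can keep writing `ret = demo_command(0x2a, 0x00, 0x10);` and still gets the log line on failure.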
Re: [PATCH] drm/panel: panel-simple: Fix proper bpc for ytc700tlag_05_201c
Hi Sam and Thierry, On Tue, May 25, 2021 at 12:12 AM Jagan Teki wrote: > > ytc700tlag_05_201c panel support 8 bpc not 6 bpc as per > recent testing in i.MX8MM platform. > > Fix it. > > Signed-off-by: Jagan Teki > --- > drivers/gpu/drm/panel/panel-simple.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/panel/panel-simple.c > b/drivers/gpu/drm/panel/panel-simple.c > index 9be050ab372f..6f4151729fb7 100644 > --- a/drivers/gpu/drm/panel/panel-simple.c > +++ b/drivers/gpu/drm/panel/panel-simple.c > @@ -4164,7 +4164,7 @@ static const struct drm_display_mode > yes_optoelectronics_ytc700tlag_05_201c_mode > static const struct panel_desc yes_optoelectronics_ytc700tlag_05_201c = { > .modes = &yes_optoelectronics_ytc700tlag_05_201c_mode, > .num_modes = 1, > - .bpc = 6, > + .bpc = 8, Can you pick this, if all okay. Jagan.
Re: [PATCH] drm/panel: Fix up DT bindings for Samsung lms397kf04
Hi Linus, On Thu, Jul 01, 2021 at 11:36:18PM +0200, Linus Walleij wrote: > Improve the bindings and make them more usable: > > - Pick in spi-cpha and spi-cpol from the SPI node parent, > this will specify that we are "type 3" in the device tree > rather than hardcoding it in the operating system. > - Drop the u32 ref from the SPI frequency: comes in from > the SPI host bindings. > - Make spi-cpha, spi-cpol and port compulsory. > - Update the example with a real-world SPI controller, > spi-gpio. > > Cc: Douglas Anderson > Cc: Noralf Trønnes > Cc: devicet...@vger.kernel.org > Signed-off-by: Linus Walleij Looks good, Reviewed-by: Sam Ravnborg
Re: [PATCH 0/4] mgag200: Various cleanups
Hi Sam

On 01.07.21 at 19:58, Sam Ravnborg wrote:
> Hi Thomas,
>
> On Thu, Jul 01, 2021 at 02:43:12PM +0200, Thomas Zimmermann wrote:
>> Cleanup several nits in the driver's init code. Also move constant data
>> into the RO data segment. No functional changes.
>>
>> Tested on mgag200 HW.
>>
>> Thomas Zimmermann (4):
>>   drm/mgag200: Don't pass flags to drm_dev_register()
>>   drm/mgag200: Inline mgag200_device_init()
>
> This patch drops a redundant error message too; it would have helped me
> if the changelog had said so, but whatever.

Sure, I'll add it to the log.

>>   drm/mgag200: Extract device type and flags in mgag200_pci_probe()
>>   drm/mgag200: Constify LUT for programming bpp
>
> Full series is:
> Acked-by: Sam Ravnborg

Thanks for the review.

Best regards
Thomas

>
>         Sam

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Felix Imendörffer
Re: [PATCH 2/2] drm/vc4: hdmi: Convert to gpiod
On Mon, May 24, 2021 at 03:18:52PM +0200, Maxime Ripard wrote: > The new gpiod interface takes care of parsing the GPIO flags and to > return the logical value when accessing an active-low GPIO, so switching > to it simplifies a lot the driver. > > Signed-off-by: Maxime Ripard > --- > drivers/gpu/drm/vc4/vc4_hdmi.c | 24 +++- > drivers/gpu/drm/vc4/vc4_hdmi.h | 3 +-- > 2 files changed, 8 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c > index ccc6c8079dc6..34622c59f6a7 100644 > --- a/drivers/gpu/drm/vc4/vc4_hdmi.c > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c > @@ -159,10 +159,9 @@ vc4_hdmi_connector_detect(struct drm_connector > *connector, bool force) > struct vc4_hdmi *vc4_hdmi = connector_to_vc4_hdmi(connector); > bool connected = false; > > - if (vc4_hdmi->hpd_gpio) { > - if (gpio_get_value_cansleep(vc4_hdmi->hpd_gpio) ^ > - vc4_hdmi->hpd_active_low) > - connected = true; > + if (vc4_hdmi->hpd_gpio && > + gpiod_get_value_cansleep(vc4_hdmi->hpd_gpio)) { > + connected = true; > } else if (drm_probe_ddc(vc4_hdmi->ddc)) { > connected = true; > } else if (HDMI_READ(HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED) { > @@ -1993,7 +1992,6 @@ static int vc4_hdmi_bind(struct device *dev, struct > device *master, void *data) > struct vc4_hdmi *vc4_hdmi; > struct drm_encoder *encoder; > struct device_node *ddc_node; > - u32 value; > int ret; > > vc4_hdmi = devm_kzalloc(dev, sizeof(*vc4_hdmi), GFP_KERNEL); > @@ -2031,18 +2029,10 @@ static int vc4_hdmi_bind(struct device *dev, struct > device *master, void *data) > /* Only use the GPIO HPD pin if present in the DT, otherwise >* we'll use the HDMI core's register. 
>*/ > - if (of_find_property(dev->of_node, "hpd-gpios", &value)) { > - enum of_gpio_flags hpd_gpio_flags; > - > - vc4_hdmi->hpd_gpio = of_get_named_gpio_flags(dev->of_node, > - "hpd-gpios", 0, > - &hpd_gpio_flags); > - if (vc4_hdmi->hpd_gpio < 0) { > - ret = vc4_hdmi->hpd_gpio; > - goto err_put_ddc; > - } > - > - vc4_hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW; > + vc4_hdmi->hpd_gpio = devm_gpiod_get_optional(dev, "hpd", GPIOD_IN); > + if (IS_ERR(vc4_hdmi->hpd_gpio)) { > + ret = PTR_ERR(vc4_hdmi->hpd_gpio); > + goto err_put_ddc; > } > > vc4_hdmi->disable_wifi_frequencies = > diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.h b/drivers/gpu/drm/vc4/vc4_hdmi.h > index 060bcaefbeb5..2688a55461d6 100644 > --- a/drivers/gpu/drm/vc4/vc4_hdmi.h > +++ b/drivers/gpu/drm/vc4/vc4_hdmi.h > @@ -146,8 +146,7 @@ struct vc4_hdmi { > /* VC5 Only */ > void __iomem *rm_regs; > > - int hpd_gpio; > - bool hpd_active_low; > + struct gpio_desc *hpd_gpio; > > /* >* On some systems (like the RPi4), some modes are in the same > -- > 2.31.1 Hi Maxime, This patch as commit 6800234ceee0 ("drm/vc4: hdmi: Convert to gpiod") causes my Raspberry Pi 3 to lock up shortly after boot in combination with commit 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to runtime_pm"). The serial console and ssh are completely unresponsive and I do not see any messages in dmesg with "debug ignore_loglevel". The device is running with a 32-bit kernel (multi_v7_defconfig) with 32-bit userspace. If there is any further information that I can provide, please let me know. Cheers, Nathan
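As background to the quoted commit message: the descriptor interface hands back the logical state of a line, so the driver no longer has to XOR the raw level with an active-low flag itself. The sketch below models only that difference in userspace; `struct demo_gpio` and both helpers are hypothetical and are not the kernel's gpiod API.

```c
#include <stdbool.h>

/* Hypothetical model of a GPIO line; not the kernel's struct gpio_desc. */
struct demo_gpio {
	bool raw_level;   /* physical level on the pin */
	bool active_low;  /* polarity flag, as it would be parsed from the DT */
};

/* Legacy style: the caller reads the raw level and applies polarity itself. */
static bool demo_legacy_connected(const struct demo_gpio *g)
{
	return g->raw_level ^ g->active_low;
}

/* Descriptor style: the accessor folds polarity in and returns the logical value. */
static bool demo_get_logical_value(const struct demo_gpio *g)
{
	return g->active_low ? !g->raw_level : g->raw_level;
}
```

Both helpers agree for every combination of level and polarity, which is why the conversion can drop the separate `hpd_active_low` field.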
Re: [PATCH v15 12/12] of: Add plumbing for restricted DMA pool
Hi,

On Thu, Jun 24, 2021 at 11:55:26PM +0800, Claire Chang wrote:
> If a device is not behind an IOMMU, we look up the device node and set
> up the restricted DMA when the restricted-dma-pool is presented.
>
> Signed-off-by: Claire Chang
> Tested-by: Stefano Stabellini
> Tested-by: Will Deacon

With this patch in place, all sparc and sparc64 qemu emulations
fail to boot. Symptom is that the root file system is not found.
Reverting this patch fixes the problem. Bisect log is attached.

Guenter

---
# bad: [fb0ca446157a86b75502c1636b0d81e642fe6bf1] Add linux-next specific files for 20210701
# good: [62fb9874f5da54fdb243003b386128037319b219] Linux 5.13
git bisect start 'HEAD' 'v5.13'
# bad: [f63c4fda987a19b1194cc45cb72fd5bf968d9d90] Merge remote-tracking branch 'rdma/for-next'
git bisect bad f63c4fda987a19b1194cc45cb72fd5bf968d9d90
# good: [46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4] Merge remote-tracking branch 'ipsec/master'
git bisect good 46bb5dd1d2a63e906e374e97dfd4a5e33934b1c4
# good: [43ba6969cfb8185353a7a6fc79070f13b9e3d6d3] Merge remote-tracking branch 'clk/clk-next'
git bisect good 43ba6969cfb8185353a7a6fc79070f13b9e3d6d3
# good: [1ca5eddcf8dca1d6345471c6404e7364af0d7019] Merge remote-tracking branch 'fuse/for-next'
git bisect good 1ca5eddcf8dca1d6345471c6404e7364af0d7019
# good: [8f6d7b3248705920187263a4e7147b0752ec7dcf] Merge remote-tracking branch 'pci/next'
git bisect good 8f6d7b3248705920187263a4e7147b0752ec7dcf
# good: [df1885a755784da3ef285f36d9230c1d090ef186] RDMA/rtrs_clt: Alloc less memory with write path fast memory registration
git bisect good df1885a755784da3ef285f36d9230c1d090ef186
# good: [93d31efb58c8ad4a66bbedbc2d082df458c04e45] Merge remote-tracking branch 'cpufreq-arm/cpufreq/arm/linux-next'
git bisect good 93d31efb58c8ad4a66bbedbc2d082df458c04e45
# good: [46308965ae6fdc7c25deb2e8c048510ae51bbe66] RDMA/irdma: Check contents of user-space irdma_mem_reg_req object
git bisect good 46308965ae6fdc7c25deb2e8c048510ae51bbe66
# good: [6de7a1d006ea9db235492b288312838d6878385f] thermal/drivers/int340x/processor_thermal: Split enumeration and processing part
git bisect good 6de7a1d006ea9db235492b288312838d6878385f
# good: [081bec2577cda3d04f6559c60b6f4e2242853520] dt-bindings: of: Add restricted DMA pool
git bisect good 081bec2577cda3d04f6559c60b6f4e2242853520
# good: [bf95ac0bcd69979af146852f6a617a60285ebbc1] Merge remote-tracking branch 'thermal/thermal/linux-next'
git bisect good bf95ac0bcd69979af146852f6a617a60285ebbc1
# good: [3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc] RDMA/core: Always release restrack object
git bisect good 3d8287544223a3d2f37981c1f9ffd94d0b5e9ffc
# bad: [cff1f23fad6e0bd7d671acce0d15285c709f259c] Merge remote-tracking branch 'swiotlb/linux-next'
git bisect bad cff1f23fad6e0bd7d671acce0d15285c709f259c
# bad: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing for restricted DMA pool
git bisect bad b655006619b7bccd0dc1e055bd72de5d613e7b5c
# first bad commit: [b655006619b7bccd0dc1e055bd72de5d613e7b5c] of: Add plumbing for restricted DMA pool
Re: [PATCH v2] drm/panfrost:report the full raw fault information instead
Hi Steve,

> You didn't answer my previous question:
>
> > Is this device working with the kbase/DDK proprietary driver?

I don't know whether I used kbase/DDK; I only know that I used the
panfrost driver in Linux 5.11.

> What you are describing sounds like a hardware integration issue, so
> it would be good to check that the hardware is working with the
> proprietary driver to rule that out. And perhaps there is something
> in the kbase for this device that is setting a chicken bit to 'fix'
> the coherency?

I don't have the proprietary driver; I only used the driver in Linux 5.11.

Thanks very much!

Chunyou.

On Thu, 1 Jul 2021 11:15:14 +0100, Steven Price wrote:

> On 29/06/2021 04:04, Chunyou Tang wrote:
> > Hi Steve,
> > thanks for your reply.
> > I set the pte in arm_lpae_prot_to_pte(),
> > ***
> > /*
> >  * Also Mali has its own notions of shareability wherein its Inner
> >  * domain covers the cores within the GPU, and its Outer domain is
> >  * "outside the GPU" (i.e. either the Inner or System domain in CPU
> >  * terms, depending on coherency).
> >  */
> > if (prot & IOMMU_CACHE && data->iop.fmt != ARM_MALI_LPAE)
> >         pte |= ARM_LPAE_PTE_SH_IS;
> > else
> >         pte |= ARM_LPAE_PTE_SH_OS;
> > ***
> > I set pte |= ARM_LPAE_PTE_SH_NS.
> >
> > If I set pte to ARM_LPAE_PTE_SH_OS or ARM_LPAE_PTE_SH_IS, whether I
> > use a single-core GPU or a multi-core GPU, a GPU fault occurs.
> > If I set pte to ARM_LPAE_PTE_SH_NS, whether I use a single-core GPU
> > or a multi-core GPU, no GPU fault occurs.
>
> Hi,
>
> So this is a difference between Panfrost and kbase. Panfrost (well
> technically the IOMMU framework) enables the inner-shareable bit for
> all memory, whereas kbase only enables it for some memory types (the
> BASE_MEM_COHERENT_LOCAL flag in the UABI controls it). However this
> should only be a performance/power difference (and AFAIK probably an
> irrelevant one) and it's definitely required that "inner shareable"
> (i.e. within the GPU) works for communication between the different
> units of the GPU.
>
> You didn't answer my previous question:
>
> > Is this device working with the kbase/DDK proprietary driver?
>
> What you are describing sounds like a hardware integration issue, so
> it would be good to check that the hardware is working with the
> proprietary driver to rule that out. And perhaps there is something
> in the kbase for this device that is setting a chicken bit to 'fix'
> the coherency?
>
> Steve
Re: [PATCH 06/53] drm/i915/selftests: Allow for larger engine counts
On Thu, Jul 01, 2021 at 01:23:40PM -0700, Matt Roper wrote:
> From: John Harrison
>
> Increasing the engine count causes a couple of local array variables
> to exceed the kernel stack limit. So make them dynamic allocations
> instead.
>
> Signed-off-by: John Harrison
> Signed-off-by: Daniele Ceraolo Spurio
> Signed-off-by: Matt Roper
> ---
>  drivers/gpu/drm/i915/gt/selftest_execlists.c  | 10 --
>  .../gpu/drm/i915/gt/selftest_workarounds.c    | 32 ---
>  2 files changed, 29 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> index 08896ae027d5..1e7fe479 100644
> --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
> +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
> @@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
>  #define BATCH BIT(0)
>  {
>          struct task_struct *tsk[I915_NUM_ENGINES] = {};
> -        struct preempt_smoke arg[I915_NUM_ENGINES];
> +        struct preempt_smoke *arg;
>          struct intel_engine_cs *engine;
>          enum intel_engine_id id;
>          unsigned long count;
>          int err = 0;
>
> +        arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL);
> +        if (!arg)
> +                return -ENOMEM;
> +
>          for_each_engine(engine, smoke->gt, id) {
>                  arg[id] = *smoke;
>                  arg[id].engine = engine;
> @@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
>                          arg[id].batch = NULL;
>                  arg[id].count = 0;
>
> -                tsk[id] = kthread_run(smoke_crescendo_thread, &arg,
> +                tsk[id] = kthread_run(smoke_crescendo_thread, arg,
>                                        "igt/smoke:%d", id);
>                  if (IS_ERR(tsk[id])) {
>                          err = PTR_ERR(tsk[id]);
> @@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
>          pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n",
>                  count, flags, smoke->gt->info.num_engines, smoke->ncontext);
> +
> +        kfree(arg);
>          return 0;

This looks correct, but apparently this test doesn't test anything, as
`err` is write-only: there is only one read, and that is just to avoid
overriding an earlier error.

Looks like this should be `return err;`? +Chris

This patch itself looks good.

Reviewed-by: Lucas De Marchi

Lucas De Marchi
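To make the two points in this review concrete, here is a userspace sketch under stated assumptions: `calloc()` stands in for `kmalloc_array()`, and `demo_run_engine()`, `DEMO_NUM_ENGINES` and `struct demo_arg` are hypothetical, not i915 names. It shows a per-engine array moved off the stack and the first recorded error being returned instead of a hard-coded 0.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_NUM_ENGINES 8 /* hypothetical engine count */

struct demo_arg {
	int engine;
	unsigned long count;
};

/* Hypothetical per-engine work: engine 3 fails, the others submit 10 requests. */
static int demo_run_engine(struct demo_arg *arg)
{
	if (arg->engine == 3)
		return -EIO;
	arg->count = 10;
	return 0;
}

static int demo_crescendo(unsigned long *total)
{
	struct demo_arg *arg;
	int err = 0;
	int id;

	/* Heap allocation keeps the stack frame small however many engines exist. */
	arg = calloc(DEMO_NUM_ENGINES, sizeof(*arg));
	if (!arg)
		return -ENOMEM;

	*total = 0;
	for (id = 0; id < DEMO_NUM_ENGINES; id++) {
		int status;

		arg[id].engine = id;
		status = demo_run_engine(&arg[id]);
		if (status < 0 && !err)
			err = status; /* keep only the first error */
		*total += arg[id].count;
	}

	free(arg);
	return err; /* returning 0 here would hide every failure */
}
```

In this sketch the caller sees `-EIO` even though seven of the eight engines succeeded, which is the behaviour the `return err;` suggestion is after.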
[PATCH] drm/dbi: Print errors for mipi_dbi_command()
The macro mipi_dbi_command() does not report errors unless you wrap it in another macro to do the error reporting. Report a rate-limited error so we know what is going on. Drop the only user in DRM using mipi_dbi_command() and actually checking the error explicitly, let it use mipi_dbi_command_buf() directly instead. After this any code wishing to send command arrays can rely on mipi_dbi_command() providing an appropriate error message if something goes wrong. Suggested-by: Noralf Trønnes Suggested-by: Douglas Anderson Signed-off-by: Linus Walleij --- drivers/gpu/drm/drm_mipi_dbi.c | 2 +- include/drm/drm_mipi_dbi.h | 5 - 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c index 3854fb9798e9..c7c1b75df190 100644 --- a/drivers/gpu/drm/drm_mipi_dbi.c +++ b/drivers/gpu/drm/drm_mipi_dbi.c @@ -645,7 +645,7 @@ static int mipi_dbi_poweron_reset_conditional(struct mipi_dbi_dev *dbidev, bool return 1; mipi_dbi_hw_reset(dbi); - ret = mipi_dbi_command(dbi, MIPI_DCS_SOFT_RESET); + ret = mipi_dbi_command_buf(dbi, MIPI_DCS_SOFT_RESET, NULL, 0); if (ret) { DRM_DEV_ERROR(dev, "Failed to send reset command (%d)\n", ret); if (dbidev->regulator) diff --git a/include/drm/drm_mipi_dbi.h b/include/drm/drm_mipi_dbi.h index f543d6e3e822..2057ad32760c 100644 --- a/include/drm/drm_mipi_dbi.h +++ b/include/drm/drm_mipi_dbi.h @@ -183,7 +183,10 @@ int mipi_dbi_buf_copy(void *dst, struct drm_framebuffer *fb, #define mipi_dbi_command(dbi, cmd, seq...) \ ({ \ const u8 d[] = { seq }; \ - mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \ + int ret; \ + ret = mipi_dbi_command_stackbuf(dbi, cmd, d, ARRAY_SIZE(d)); \ + if (ret) \ + pr_err_ratelimited("MIPI DBI: error %d when sending command\n", ret); \ }) #ifdef CONFIG_DEBUG_FS -- 2.31.1
Re: [Intel-gfx] [PATCH 05/53] drm/i915/gen12: Use fuse info to enable SFC
On Thu, Jul 01, 2021 at 01:23:39PM -0700, Matt Roper wrote:
> From: Venkata Sandeep Dhanalakota
>
> In Gen12 there are various fuse combinations and in each configuration
> vdbox engine may be connected to SFC depending on which engines are
> available, so we need to set the SFC capability based on fuse value from
> the hardware. Even numbered physical instance always have SFC, odd
> numbered physical instances have SFC only if previous even instance is
> fused off.
>
> Bspec: 48028

Considering that in TGL we have physical instances 0 and 2 (both even),
we can use this logic, so it's correct for GRAPHICS_VER(i915) == 12.
Although I wonder if we should be using MEDIA_VER(i915) here.

> Cc: Tvrtko Ursulin
> Cc: Daniele Ceraolo Spurio
> Signed-off-by: Venkata Sandeep Dhanalakota
> Signed-off-by: Matt Roper

Reviewed-by: Lucas De Marchi

Lucas De Marchi

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++-
>  1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 151870d8fdd3..4ab2c9abb943 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt)
>          }
>  }
>
> +static inline
> +bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox,
> +                   unsigned int logical_vdbox, u16 vdbox_mask)
> +{
> +        /*
> +         * In Gen11, only even numbered logical VDBOXes are hooked
> +         * up to an SFC (Scaler & Format Converter) unit.
> +         * In Gen12, Even numbered physical instance always are connected
> +         * to an SFC. Odd numbered physical instances have SFC only if
> +         * previous even instance is fused off.
> +         */
> +        if (GRAPHICS_VER(i915) == 12) {
> +                return (physical_vdbox % 2 == 0) ||
> +                        !(BIT(physical_vdbox - 1) & vdbox_mask);
> +        } else if (GRAPHICS_VER(i915) == 11) {
> +                return logical_vdbox % 2 == 0;
> +        }
> +
> +        MISSING_CASE(GRAPHICS_VER(i915));
> +        return false;
> +}
> +
>  /*
>   * Determine which engines are fused off in our particular hardware.
>   * Note that we have a catch-22 situation where we need to be able to access
> @@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
>                          continue;
>                  }
>
> -                /*
> -                 * In Gen11, only even numbered logical VDBOXes are
> -                 * hooked up to an SFC (Scaler & Format Converter) unit.
> -                 * In TGL each VDBOX has access to an SFC.
> -                 */
> -                if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
> +                if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
>                          gt->info.vdbox_sfc_access |= BIT(i);
> +                logical_vdbox++;
>          }
>          drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n",
>                  vdbox_mask, VDBOX_MASK(gt));
> --
> 2.25.4
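The Gen12 rule from the commit message can be checked by hand with a small userspace model. This is a sketch only: the `demo_` names and `DEMO_MAX_VDBOX` are hypothetical, and the mask here is assumed to carry one "present" bit per physical VDBOX instance, as `vdbox_mask` does in the quoted hunk.

```c
#include <stdbool.h>

#define DEMO_BIT(n) (1u << (n))
#define DEMO_MAX_VDBOX 8u /* hypothetical upper bound on instances */

/* Gen11 rule: only even-numbered logical instances reach an SFC. */
static bool demo_gen11_has_sfc(unsigned int logical_vdbox)
{
	return logical_vdbox % 2 == 0;
}

/*
 * Gen12 rule: even physical instances always have an SFC; an odd one
 * has it only when the even instance just below it is fused off.
 */
static bool demo_gen12_has_sfc(unsigned int physical_vdbox,
			       unsigned int vdbox_mask)
{
	if (physical_vdbox % 2 == 0)
		return true;
	return !(DEMO_BIT(physical_vdbox - 1) & vdbox_mask);
}

/* Build the SFC access mask for every instance present in vdbox_mask. */
static unsigned int demo_gen12_sfc_access(unsigned int vdbox_mask)
{
	unsigned int access = 0;
	unsigned int i;

	for (i = 0; i < DEMO_MAX_VDBOX; i++) {
		if (!(vdbox_mask & DEMO_BIT(i)))
			continue; /* fused off: no engine, so no access bit */
		if (demo_gen12_has_sfc(i, vdbox_mask))
			access |= DEMO_BIT(i);
	}
	return access;
}
```

For the TGL case mentioned in the review (instances 0 and 2 present, mask 0x5) every engine gets SFC access, while a fully populated mask of 0xf leaves only the even instances with it.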
Re: [PATCH 04/53] drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based
On Thu, Jul 01, 2021 at 01:23:38PM -0700, Matt Roper wrote:
> From: Tvrtko Ursulin
>
> On Xe_HP the fusing register is renamed and changed to have the "enable"
> semantics, but otherwise remains compatible (mmio address, bitmask
> ranges) with older platforms.
>
> To simplify things we do not add a new register definition but just stop
> inverting the fusing masks before processing them.
>
> Bspec: 33288

This is now: Bspec: 52615

> Cc: Daniele Ceraolo Spurio
> Signed-off-by: Tvrtko Ursulin
> Signed-off-by: Matt Roper

With the change above,

Reviewed-by: Lucas De Marchi

Lucas De Marchi

> ---
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 88694822716a..151870d8fdd3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -468,7 +468,14 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
>          if (GRAPHICS_VER(i915) < 11)
>                  return info->engine_mask;
>
> -        media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
> +        /*
> +         * On newer platforms the fusing register is called 'enable' and has
> +         * enable semantics, while on older platforms it is called 'disable'
> +         * and bits have disable semantics.
> +         */
> +        media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
> +        if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
> +                media_fuse = ~media_fuse;
>
>          vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK;
>          vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
> --
> 2.25.4
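The effect of the conditional inversion can be seen with a tiny userspace model. This is a sketch under stated assumptions: `DEMO_IP_VER()`, `DEMO_VDBOX_MASK` and `demo_vdbox_enabled()` are hypothetical names, and the 8-bit field width is chosen for illustration rather than taken from the register definition.

```c
#include <stdint.h>

/* Hypothetical packing of major.release into one comparable number. */
#define DEMO_IP_VER(ver, rel) (((unsigned int)(ver) << 8) | (unsigned int)(rel))
/* Hypothetical field mask: the low 8 bits carry the VDBOX fuses. */
#define DEMO_VDBOX_MASK 0xffu

/*
 * Normalise a raw fuse value to "1 = engine enabled".
 * Before 12.50 a set bit means "disabled", so the value is inverted;
 * from 12.50 on the register already has enable semantics.
 */
static uint32_t demo_vdbox_enabled(uint32_t raw_fuse, unsigned int ip_ver)
{
	uint32_t fuse = raw_fuse;

	if (ip_ver < DEMO_IP_VER(12, 50))
		fuse = ~fuse;

	return fuse & DEMO_VDBOX_MASK;
}
```

A disable-style value of 0xfa on an older platform and an enable-style value of 0x05 on 12.50 both normalise to the same enabled mask, which is why the rest of the function can stay unchanged.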
Re: [PATCH 16/53] drm/i915/xehpsdv: add initial XeHP SDV definitions
On Thu, Jul 01, 2021 at 01:23:50PM -0700, Matt Roper wrote: > From: Lucas De Marchi > > XeHP SDV is a Intel® dGPU without display. This is just the definition > of some basic platform macros, by large a copy of current state of > Tigerlake which does not reflect the end state of this platform. > > Bspec: 44467, 48077 > Cc: Rodrigo Vivi > Signed-off-by: Lucas De Marchi > Signed-off-by: Daniele Ceraolo Spurio > Signed-off-by: José Roberto de Souza > Signed-off-by: Stuart Summers > Signed-off-by: Tomas Winkler > Signed-off-by: Matt Roper Reviewed-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/i915_drv.h | 10 ++ > drivers/gpu/drm/i915/i915_pci.c | 20 > drivers/gpu/drm/i915/intel_device_info.c | 1 + > drivers/gpu/drm/i915/intel_device_info.h | 1 + > 4 files changed, 32 insertions(+) > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index c02600850246..63bed18a2be7 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, > #define IS_DG1(dev_priv)IS_PLATFORM(dev_priv, INTEL_DG1) > #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) > #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) > +#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) > #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ > (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) > #define IS_BDW_ULT(dev_priv) \ > @@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, > (IS_ALDERLAKE_P(__i915) && \ >IS_GT_STEP(__i915, since, until)) > > +#define XEHPSDV_REVID_A0 0x0 > +#define XEHPSDV_REVID_A1 0x1 > +#define XEHPSDV_REVID_A_LAST XEHPSDV_REVID_A1 > +#define XEHPSDV_REVID_B0 0x4 > +#define XEHPSDV_REVID_C0 0x8 > + > +#define IS_XEHPSDV_REVID(p, since, until) \ > + (IS_XEHPSDV(p) && IS_REVID(p, since, until)) > + > #define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) > #define 
IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) > #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && > !IS_LP(dev_priv)) > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c > index 88b279452b87..046309e95f43 100644 > --- a/drivers/gpu/drm/i915/i915_pci.c > +++ b/drivers/gpu/drm/i915/i915_pci.c > @@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = { > .ppgtt_size = 48, \ > .ppgtt_type = INTEL_PPGTT_FULL > > +#define XE_HPM_FEATURES \ > + .media_ver = 12, \ > + .media_ver_release = 50 > + > +__maybe_unused > +static const struct intel_device_info xehpsdv_info = { > + XE_HP_FEATURES, > + XE_HPM_FEATURES, > + DGFX_FEATURES, > + PLATFORM(INTEL_XEHPSDV), > + .display = { }, > + .pipe_mask = 0, > + .platform_engine_mask = > + BIT(RCS0) | BIT(BCS0) | > + BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) | > + BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) | > + BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7), > + .require_force_probe = 1, > +}; > + > #undef PLATFORM > > /* > diff --git a/drivers/gpu/drm/i915/intel_device_info.c > b/drivers/gpu/drm/i915/intel_device_info.c > index e8ad14f002c1..7b37b68f4548 100644 > --- a/drivers/gpu/drm/i915/intel_device_info.c > +++ b/drivers/gpu/drm/i915/intel_device_info.c > @@ -68,6 +68,7 @@ static const char * const platform_names[] = { > PLATFORM_NAME(DG1), > PLATFORM_NAME(ALDERLAKE_S), > PLATFORM_NAME(ALDERLAKE_P), > + PLATFORM_NAME(XEHPSDV), > }; > #undef PLATFORM_NAME > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h > b/drivers/gpu/drm/i915/intel_device_info.h > index f824de632cfe..e8684199b0c9 100644 > --- a/drivers/gpu/drm/i915/intel_device_info.h > +++ b/drivers/gpu/drm/i915/intel_device_info.h > @@ -88,6 +88,7 @@ enum intel_platform { > INTEL_DG1, > INTEL_ALDERLAKE_S, > INTEL_ALDERLAKE_P, > + INTEL_XEHPSDV, > INTEL_MAX_PLATFORMS > }; > > -- > 2.25.4 >
Re: [PATCH 23/53] drm/i915/xehpsdv: Read correct RP_STATE_CAP register
On Thu, Jul 01, 2021 at 01:23:57PM -0700, Matt Roper wrote: > The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this > register is now a per-tile register at GTTMMADDR offset 0x250014. > > Cc: Rodrigo Vivi > Signed-off-by: Matt Roper > Signed-off-by: Lucas De Marchi Reviewed-by: Rodrigo Vivi > --- > drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++- > drivers/gpu/drm/i915/i915_reg.h | 1 + > 2 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c > b/drivers/gpu/drm/i915/gt/intel_rps.c > index 490bc1513480..8e7b70248392 100644 > --- a/drivers/gpu/drm/i915/gt/intel_rps.c > +++ b/drivers/gpu/drm/i915/gt/intel_rps.c > @@ -1937,7 +1937,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps) > struct drm_i915_private *i915 = rps_to_i915(rps); > struct intel_uncore *uncore = rps_to_uncore(rps); > > - if (IS_GEN9_LP(i915)) > + if (IS_XEHPSDV(i915)) > + return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP); > + else if (IS_GEN9_LP(i915)) > return intel_uncore_read(uncore, BXT_RP_STATE_CAP); > else > return intel_uncore_read(uncore, GEN6_RP_STATE_CAP); > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h > index 0231f42226db..2992e8585399 100644 > --- a/drivers/gpu/drm/i915/i915_reg.h > +++ b/drivers/gpu/drm/i915/i915_reg.h > @@ -4110,6 +4110,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) > #define GEN6_RP_STATE_CAP_MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998) > #define BXT_RP_STATE_CAP_MMIO(0x138170) > #define GEN9_RP_STATE_LIMITS _MMIO(0x138148) > +#define XEHPSDV_RP_STATE_CAP _MMIO(0x250014) > > /* > * Logical Context regs > -- > 2.25.4 >
[PATCH] drm/panel: Fix up DT bindings for Samsung lms397kf04
Improve the bindings and make them more usable:

- Pick up spi-cpha and spi-cpol from the SPI node parent; this
  specifies that we are "type 3" in the device tree rather than
  hardcoding it in the operating system.
- Drop the u32 ref from the SPI frequency: it comes in from the
  SPI host bindings.
- Make spi-cpha, spi-cpol and port compulsory.
- Update the example with a real-world SPI controller, spi-gpio.

Cc: Douglas Anderson
Cc: Noralf Trønnes
Cc: devicet...@vger.kernel.org
Signed-off-by: Linus Walleij
---
 .../display/panel/samsung,lms397kf04.yaml | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml b/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml
index 4cb75a5f2e3a..cd62968426fb 100644
--- a/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml
+++ b/Documentation/devicetree/bindings/display/panel/samsung,lms397kf04.yaml
@@ -33,8 +33,11 @@ properties:
 
   backlight: true
 
+  spi-cpha: true
+
+  spi-cpol: true
+
   spi-max-frequency:
-    $ref: /schemas/types.yaml#/definitions/uint32
     description: inherited as a SPI client node, the datasheet specifies
       maximum 300 ns minimum cycle which gives around 3 MHz max frequency
     maximum: 300
@@ -44,6 +47,9 @@ properties:
 required:
   - compatible
   - reg
+  - spi-cpha
+  - spi-cpol
+  - port
 
 additionalProperties: false
 
@@ -52,15 +58,23 @@ examples:
     #include
 
     spi {
+      compatible = "spi-gpio";
+      sck-gpios = <&gpio 0 GPIO_ACTIVE_HIGH>;
+      miso-gpios = <&gpio 1 GPIO_ACTIVE_HIGH>;
+      mosi-gpios = <&gpio 2 GPIO_ACTIVE_HIGH>;
+      cs-gpios = <&gpio 3 GPIO_ACTIVE_HIGH>;
+      num-chipselects = <1>;
       #address-cells = <1>;
       #size-cells = <0>;
       panel@0 {
         compatible = "samsung,lms397kf04";
         spi-max-frequency = <300>;
+        spi-cpha;
+        spi-cpol;
         reg = <0>;
         vci-supply = <&lcd_3v0_reg>;
         vccio-supply = <&lcd_1v8_reg>;
-        reset-gpios = <&gpio 1 GPIO_ACTIVE_LOW>;
+        reset-gpios = <&gpio 4 GPIO_ACTIVE_LOW>;
         backlight = <&ktd259>;
 
         port {
-- 
2.31.1
Re: [PATCH v2] drm/meson: fix potential NULL pointer exception in meson_drv_unbind()
Hello,

first of all: thanks for your patch, and sorry for being late with my
review question.

On Fri, Jun 18, 2021 at 7:28 AM Jiajun Cao wrote:
>
> Fix a potential NULL pointer exception when meson_drv_unbind()
> attempts to operate on the driver_data priv which may be NULL.
> Add a null pointer check on the priv struct to avoid the NULL
> pointer dereference after calling dev_get_drvdata(), just like
> the null pointer checks done on the struct priv in the function
> meson_drv_shutdown(), meson_drv_pm_suspend() and meson_drv_pm_resume().

I am trying to review Amlogic Meson related patches in the DRM
subsystem so I can help Neil with this. However, I am still new to
this, so please help me understand this topic.

[...]
>  static void meson_drv_unbind(struct device *dev)
>  {
>  	struct meson_drm *priv = dev_get_drvdata(dev);
> -	struct drm_device *drm = priv->drm;
> +	struct drm_device *drm;
> +
> +	if (!priv)
> +		return;

My understanding of the component framework is that meson_drv_unbind()
is only called if meson_drv_bind() was called previously (and did not
return an error). This is different from meson_drv_shutdown() (for
example), because that can be called when meson_drv_probe() returned 0
(success) with the "count" variable being 0 (in which case the probe
function does nothing).

As I mentioned before: I am still learning about the DRM subsystem in
the Linux kernel. So it would be great if you could help me understand
in which scenarios this newly added if-condition is needed.

Thank you!
Best regards,
Martin
Re: [git pull] drm for 5.14-rc1
On 2021-07-01 at 4:15 p.m., Linus Torvalds wrote:
> On Wed, Jun 30, 2021 at 9:34 PM Dave Airlie wrote:
>> Hi Linus,
>>
>> This is the main drm pull request for 5.14-rc1.
>>
>> I've done a test pull into your current tree, and hit two conflicts
>> (one in vc4, one in amdgpu), both seem pretty trivial, the amdgpu one
>> is recent and sfr sent out a resolution for it today.
> Well, the resolutions may be trivial, but the conflict made me look at
> the code, and it's buggy.
>
> Commit 04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function")
> is broken. It made the code do
>
>         mmap_read_lock(mm);
>         vma = find_vma(mm, start);
>         mmap_read_unlock(mm);
>
> and then it *uses* that "vma" after it has dropped the lock.
>
> That's a big no-no - once you've dropped the lock, the vma contents
> simply aren't reliable any more. That mapping could now be unmapped
> and removed at any time.
>
> Now, the conflict actually made one of the uses go away (switching to
> vma_lookup() means that the subsequent code no longer needs to look at
> "vm_start" to verify we're actually _inside_ the vma), but it still
> checks for vma->vm_file afterwards.
>
> So those locking changes in commit 04d8d73dbcbe are completely bogus.
>
> I tried to fix up that bug while handling the conflict, but who knows
> what else similar is going on elsewhere.
>
> So I would ask people to
>
>  (a) verify that I didn't make things worse as I fixed things up (note
> how I had to change the last argument to amdgpu_hmm_range_get_pages()
> from false to true etc).
>
>  (b) go and look at their vma lookup code: you can't just look up a
> vma under the lock, and then drop the lock, and then think things stay
> stable.
>
> In particular for that (b) case: it is *NOT* enough to look up
> vma->vm_file inside the lock and cache that. No - if the test is about
> "no backing file before looking up pages", then you have to *keep*
> holding the lock until after you've actually looked up the pages!
>
> Because otherwise any test for "vma->vm_file" is entirely pointless,
> for the same reason it's buggy to even look at it after dropping the
> lock: because once you've dropped the lock, the thing you just tested
> for might not be true any more.
>
> So no, it's not valid to do
>
>         bool has_file = vma && vma->vm_file;
>
> and then drop the lock, because you don't use 'vma' any more as a
> pointer, and then use 'has_file' outside the lock. Because after
> you've dropped the lock, 'has_file' is now meaningless.
>
> So it's not just about "you can't look at vma->vm_file after dropping
> the lock". It's more fundamental than that. Any *decision* you make
> based on the vma is entirely pointless and moot after the lock is
> dropped!
>
> Did I fix it up correctly? Who knows. The code makes more sense to me
> now and seems valid. But I really *really* want to stress how locking
> is important.

Thank you for the fix and the explanation. Your fix looks correct. I
also double-checked all other uses of find_vma in the amdgpu driver.
They all hold the mmap lock correctly.

Two comments:

With this fix, we could remove the bool mmap_locked parameter from
amdgpu_hmm_range_get_pages because it always gets called with the lock
held now.

You're now holding the mmap lock from the vma_lookup until
hmm_range_fault is done. This ensures that the result of the
vma->vm_file check remains valid. This was broken even before our
commit 04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function").

>
> You also can't just unlock in the middle of an operation - even if you
> then take the lock *again* later (as amdgpu_hmm_range_get_pages() then
> did), the fact that you unlocked in the middle means that all the
> earlier tests you did are simply no longer valid when you re-take the
> lock.

I agree completely. I catch a lot of locking bugs in code review. I
probably missed this one because I wasn't paying enough attention to
what was being protected by the mmap_read_lock in this case.

Regards,
  Felix

>
>               Linus
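The rule discussed in this thread (look up, test and act inside one critical section) can be illustrated with a small userspace sketch. Everything here is a hypothetical stand-in rather than kernel code: struct addr_space plays the role of the mm, struct region the role of a vma, and the "lock" is a single-threaded toy counter that only lets the lookup assert it is being called with the lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma. */
struct region {
	unsigned long start, end;	/* covers [start, end) */
	bool has_file;			/* stands in for vma->vm_file != NULL */
};

/* Hypothetical stand-in for the mm; 'readers' is a toy read-lock count. */
struct addr_space {
	int readers;
	struct region regions[4];
	size_t nr_regions;
};

static void as_read_lock(struct addr_space *as)
{
	as->readers++;
}

static void as_read_unlock(struct addr_space *as)
{
	assert(as->readers > 0);
	as->readers--;
}

/* The returned pointer is only meaningful while the lock is held. */
static struct region *lookup_region(struct addr_space *as, unsigned long addr)
{
	assert(as->readers > 0);	/* caller must hold the lock */
	for (size_t i = 0; i < as->nr_regions; i++)
		if (addr >= as->regions[i].start && addr < as->regions[i].end)
			return &as->regions[i];
	return NULL;
}

/*
 * Look up the region, test it and "get the page" without dropping the lock
 * in between. Returns 0 on success, -1 if unmapped or file-backed.
 */
static int get_page_anon(struct addr_space *as, unsigned long addr,
			 unsigned long *page_out)
{
	struct region *r;
	int ret = -1;

	as_read_lock(as);
	r = lookup_region(as, addr);
	if (r && !r->has_file) {
		/* Still locked, so the check above cannot have gone stale. */
		*page_out = addr & ~0xfffUL;
		ret = 0;
	}
	as_read_unlock(as);
	return ret;
}
```

The broken variant would unlock right after lookup_region() and then test r->has_file, or cache the test result in a bool and act on it after unlocking. In this toy nothing can change concurrently, but in a real address space both forms act on information that may already be out of date.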
Re: [Freedreno] [RFC 2/6] drm/msm/dpu: support setting up two independent DSI connectors
On 2021-06-09 14:17, Dmitry Baryshkov wrote: Move setting up encoders from set_encoder_mode to _dpu_kms_initialize_dsi() / _dpu_kms_initialize_displayport(). This allows us to support not only "single DSI" and "dual DSI" but also "two independent DSI" configurations. In future this would also help adding support for multiple DP connectors. Signed-off-by: Dmitry Baryshkov I will have to see Bjorn's changes to check why it was dependent on this cleanup. Is the plan to call _dpu_kms_initialize_displayport() twice? But still I am not able to put together where is the dependency on that series with this one. Can you please elaborate on that a little bit? --- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 89 - 1 file changed, 44 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c index 1d3a4f395e74..b63e1c948ff2 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c @@ -471,30 +471,55 @@ static int _dpu_kms_initialize_dsi(struct drm_device *dev, struct dpu_kms *dpu_kms) { struct drm_encoder *encoder = NULL; + struct msm_display_info info; int i, rc = 0; if (!(priv->dsi[0] || priv->dsi[1])) return rc; - /*TODO: Support two independent DSI connectors */ - encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); - if (IS_ERR(encoder)) { - DPU_ERROR("encoder init failed for dsi display\n"); - return PTR_ERR(encoder); - } - - priv->encoders[priv->num_encoders++] = encoder; - for (i = 0; i < ARRAY_SIZE(priv->dsi); i++) { if (!priv->dsi[i]) continue; + if (!encoder) { + encoder = dpu_encoder_init(dev, DRM_MODE_ENCODER_DSI); + if (IS_ERR(encoder)) { + DPU_ERROR("encoder init failed for dsi display\n"); + return PTR_ERR(encoder); + } + + priv->encoders[priv->num_encoders++] = encoder; + + memset(&info, 0, sizeof(info)); + info.intf_type = encoder->encoder_type; + info.capabilities = msm_dsi_is_cmd_mode(priv->dsi[i]) ? 
+ MSM_DISPLAY_CAP_CMD_MODE : + MSM_DISPLAY_CAP_VID_MODE; + } + rc = msm_dsi_modeset_init(priv->dsi[i], dev, encoder); if (rc) { DPU_ERROR("modeset_init failed for dsi[%d], rc = %d\n", i, rc); break; } + + info.h_tile_instance[info.num_of_h_tiles++] = i; + + if (!msm_dsi_is_dual_dsi(priv->dsi[i])) { I would like to clarify the terminology of dual_dsi in the current DSI driver before the rest of the reviews. Today IS_DUAL_DSI() means that two DSIs are driving the same display and the two DSIs are operating in master-slave mode and are being driven by the same PLL. Usually, dual independent DSI means two DSIs driving two separate panels using two separate PLLs ( DSI0 with PLL0 and DSI1 with PLL1) I assume thats happening due to the foll logic and both DSI PHYs are operating in STANDALONE mode: if (!IS_DUAL_DSI()) { ret = msm_dsi_host_register(msm_dsi->host, true); if (ret) return ret; msm_dsi_phy_set_usecase(msm_dsi->phy, MSM_DSI_PHY_STANDALONE); ret = msm_dsi_host_set_src_pll(msm_dsi->host, msm_dsi->phy); + rc = dpu_encoder_setup(dev, encoder, &info); + if (rc) + DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n", + encoder->base.id, rc); + encoder = NULL; + } + } + + if (encoder) { We will hit this case only for split-DSI right? ( that is two DSIs driving the same panel ). Even single DSI will be created in the above loop now. So this looks a bit confusing at the moment. I think we need to be more clear on dual-DSI Vs split-DSI to avoid confusion in the code about which one means what and the one which we are currently using. So what about having IS_DUAL_DSI() and IS_SPLIT_DSI() to distinguish the terminologies and chaging DSI driver accordingly. + rc = dpu_encoder_setup(dev, encoder, &info); + if (rc) + DPU_ERROR("failed to setup DPU encoder %d: rc:%d\n", + encoder->base.id, rc); } return rc; @@ -505,6 +530,7 @@ static int _dpu_kms_initialize_displayport(struct drm_device *dev, struct dpu_kms *dpu_kms) { struct drm_encoder *encode
Re: [PATCH] drm/msm/dsi: drop gdsc regulator handling
On Wed 30 Jun 19:00 CDT 2021, Dmitry Baryshkov wrote:

> None of the supported devices uses the "gdsc" regulator for DSI. GDSC
> support is now implemented as a power domain. Drop the old code and
> config handling gdsc regulator requesting and enabling.
>
> Signed-off-by: Dmitry Baryshkov

Reviewed-by: Bjorn Andersson

Regards,
Bjorn

> ---
>  drivers/gpu/drm/msm/dsi/dsi_cfg.c  | 12 ++++--------
>  drivers/gpu/drm/msm/dsi/dsi_host.c | 22 +++-------------------
>  2 files changed, 7 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_cfg.c b/drivers/gpu/drm/msm/dsi/dsi_cfg.c
> index f3f1c03c7db9..32c37d7c2109 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_cfg.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_cfg.c
> @@ -32,9 +32,8 @@ static const char * const dsi_6g_bus_clk_names[] = {
>  static const struct msm_dsi_config msm8974_apq8084_dsi_cfg = {
>  	.io_offset = DSI_6G_REG_SHIFT,
>  	.reg_cfg = {
> -		.num = 4,
> +		.num = 3,
>  		.regs = {
> -			{"gdsc", -1, -1},
>  			{"vdd", 15, 100},	/* 3.0 V */
>  			{"vdda", 10, 100},	/* 1.2 V */
>  			{"vddio", 10, 100},	/* 1.8 V */
> @@ -53,9 +52,8 @@ static const char * const dsi_8916_bus_clk_names[] = {
>  static const struct msm_dsi_config msm8916_dsi_cfg = {
>  	.io_offset = DSI_6G_REG_SHIFT,
>  	.reg_cfg = {
> -		.num = 3,
> +		.num = 2,
>  		.regs = {
> -			{"gdsc", -1, -1},
>  			{"vdda", 10, 100},	/* 1.2 V */
>  			{"vddio", 10, 100},	/* 1.8 V */
>  		},
> @@ -73,9 +71,8 @@ static const char * const dsi_8976_bus_clk_names[] = {
>  static const struct msm_dsi_config msm8976_dsi_cfg = {
>  	.io_offset = DSI_6G_REG_SHIFT,
>  	.reg_cfg = {
> -		.num = 3,
> +		.num = 2,
>  		.regs = {
> -			{"gdsc", -1, -1},
>  			{"vdda", 10, 100},	/* 1.2 V */
>  			{"vddio", 10, 100},	/* 1.8 V */
>  		},
> @@ -89,9 +86,8 @@ static const struct msm_dsi_config msm8976_dsi_cfg = {
>  static const struct msm_dsi_config msm8994_dsi_cfg = {
>  	.io_offset = DSI_6G_REG_SHIFT,
>  	.reg_cfg = {
> -		.num = 7,
> +		.num = 6,
>  		.regs = {
> -			{"gdsc", -1, -1},
>  			{"vdda", 10, 100},	/* 1.25 V */
>  			{"vddio", 10, 100},	/* 1.8 V */
>  			{"vcca", 1, 100},	/* 1.0 V */
> diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
> index ed504fe5074f..66c425d4159c 100644
> --- a/drivers/gpu/drm/msm/dsi/dsi_host.c
> +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
> @@ -203,35 +203,22 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>  {
>  	const struct msm_dsi_cfg_handler *cfg_hnd = NULL;
>  	struct device *dev = &msm_host->pdev->dev;
> -	struct regulator *gdsc_reg;
>  	struct clk *ahb_clk;
>  	int ret;
>  	u32 major = 0, minor = 0;
>
> -	gdsc_reg = regulator_get(dev, "gdsc");
> -	if (IS_ERR(gdsc_reg)) {
> -		pr_err("%s: cannot get gdsc\n", __func__);
> -		goto exit;
> -	}
> -
>  	ahb_clk = msm_clk_get(msm_host->pdev, "iface");
>  	if (IS_ERR(ahb_clk)) {
>  		pr_err("%s: cannot get interface clock\n", __func__);
> -		goto put_gdsc;
> +		goto exit;
>  	}
>
>  	pm_runtime_get_sync(dev);
>
> -	ret = regulator_enable(gdsc_reg);
> -	if (ret) {
> -		pr_err("%s: unable to enable gdsc\n", __func__);
> -		goto put_gdsc;
> -	}
> -
>  	ret = clk_prepare_enable(ahb_clk);
>  	if (ret) {
>  		pr_err("%s: unable to enable ahb_clk\n", __func__);
> -		goto disable_gdsc;
> +		goto runtime_put;
>  	}
>
>  	ret = dsi_get_version(msm_host->ctrl_base, &major, &minor);
> @@ -246,11 +233,8 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
>
>  disable_clks:
>  	clk_disable_unprepare(ahb_clk);
> -disable_gdsc:
> -	regulator_disable(gdsc_reg);
> +runtime_put:
>  	pm_runtime_put_sync(dev);
> -put_gdsc:
> -	regulator_put(gdsc_reg);
>  exit:
>  	return cfg_hnd;
>  }
> -- 
> 2.30.2
>
[PATCH 12/53] drm/i915/xehp: Handle new device context ID format
From: Stuart Summers Xe_HP changes the format of the context ID from past platforms. Cc: Robert M. Fosha Signed-off-by: Stuart Summers Signed-off-by: Umesh Nerlige Ramappa Signed-off-by: Matt Roper --- .../drm/i915/gt/intel_execlists_submission.c | 74 --- drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++ drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 29 +--- drivers/gpu/drm/i915/i915_reg.h | 5 ++ 5 files changed, 97 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 15ba0d83151a..3a9d99a69ed4 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -153,6 +153,12 @@ #define GEN12_CSB_CTX_VALID(csb_dw) \ (FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID) +#define XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE BIT(1) /* upper csb dword */ +#define XEHP_CSB_SW_CTX_ID_MASKGENMASK(31, 10) +#define XEHP_IDLE_CTX_ID 0x +#define XEHP_CSB_CTX_VALID(csb_dw) \ + (FIELD_GET(XEHP_CSB_SW_CTX_ID_MASK, csb_dw) != XEHP_IDLE_CTX_ID) + /* Typical size of the average request (2 pipecontrols and a MI_BB) */ #define EXECLISTS_REQUEST_SIZE 64 /* bytes */ @@ -490,6 +496,16 @@ __execlists_schedule_in(struct i915_request *rq) /* Use a fixed tag for OA and friends */ GEM_BUG_ON(ce->tag <= BITS_PER_LONG); ce->lrc.ccid = ce->tag; + } else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) { + /* We don't need a strict matching tag, just different values */ + unsigned int tag = ffs(READ_ONCE(engine->context_tag)); + + GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG); + clear_bit(tag - 1, &engine->context_tag); + ce->lrc.ccid = tag << (XEHP_SW_CTX_ID_SHIFT - 32); + + BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID); + } else { /* We don't need a strict matching tag, just different values */ unsigned int tag = __ffs(engine->context_tag); @@ -600,8 +616,14 @@ static void 
__execlists_schedule_out(struct i915_request * const rq, intel_engine_add_retire(engine, ce->timeline); ccid = ce->lrc.ccid; - ccid >>= GEN11_SW_CTX_ID_SHIFT - 32; - ccid &= GEN12_MAX_CONTEXT_HW_ID; + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) { + ccid >>= XEHP_SW_CTX_ID_SHIFT - 32; + ccid &= XEHP_MAX_CONTEXT_HW_ID; + } else { + ccid >>= GEN11_SW_CTX_ID_SHIFT - 32; + ccid &= GEN12_MAX_CONTEXT_HW_ID; + } + if (ccid < BITS_PER_LONG) { GEM_BUG_ON(ccid == 0); GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag)); @@ -1660,13 +1682,24 @@ static void invalidate_csb_entries(const u64 *first, const u64 *last) * bits 44-46: reserved * bits 47-57: sw context id of the lrc the GT switched away from * bits 58-63: sw counter of the lrc the GT switched away from + * + * Xe_HP csb shuffles things around compared to TGL: + * + * bits 0-3: context switch detail (same possible values as TGL) + * bits 4-9: engine instance + * bits 10-25: sw context id of the lrc the GT switched to + * bits 26-31: sw counter of the lrc the GT switched to + * bit 32:semaphore wait mode (poll or signal), Only valid when + * switch detail is set to "wait on semaphore" + * bit 33:switched to new queue + * bits 34-41: wait detail (for switch detail 1 to 4) + * bits 42-57: sw context id of the lrc the GT switched away from + * bits 58-63: sw counter of the lrc the GT switched away from */ -static bool gen12_csb_parse(const u64 csb) +static inline bool +__gen12_csb_parse(bool ctx_to_valid, bool ctx_away_valid, bool new_queue, + u8 switch_detail) { - bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(csb)); - bool new_queue = - lower_32_bits(csb) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE; - /* * The context switch detail is not guaranteed to be 5 when a preemption * occurs, so we can't just check for that. The check below works for @@ -1675,7 +1708,7 @@ static bool gen12_csb_parse(const u64 csb) * would require some extra handling, but we don't support that. 
*/ if (!ctx_away_valid || new_queue) { - GEM_BUG_ON(!GEN12_CSB_CTX_VALID(lower_32_bits(csb))); + GEM_BUG_ON(!ctx_to_valid); return true; } @@ -1684,10 +1717,26 @@ static bool gen12_csb_parse(const u64 csb) * context switch on an unsuccessful wait instruction since we always * use polling mode. */ -
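As a reading aid for the Xe_HP CSB layout described in the comment block of this patch, here is a minimal userspace sketch that unpacks a 64-bit CSB entry into its fields. The struct xehp_csb type and the bits()/xehp_csb_decode() helpers are hypothetical and not part of the driver; the bit positions follow the quoted comment only, not the mask macros in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract bits hi..lo (inclusive) of a 64-bit value. */
static uint64_t bits(uint64_t v, unsigned int hi, unsigned int lo)
{
	assert(hi < 64 && lo <= hi);
	unsigned int width = hi - lo + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

	return (v >> lo) & mask;
}

/* Hypothetical decoded view of one Xe_HP CSB entry. */
struct xehp_csb {
	uint8_t switch_detail;		/* bits 0-3 */
	uint8_t engine_instance;	/* bits 4-9 */
	uint16_t to_ctx_id;		/* bits 10-25 */
	uint8_t to_counter;		/* bits 26-31 */
	bool sema_wait_mode;		/* bit 32 */
	bool new_queue;			/* bit 33 */
	uint8_t wait_detail;		/* bits 34-41 */
	uint16_t away_ctx_id;		/* bits 42-57 */
	uint8_t away_counter;		/* bits 58-63 */
};

static struct xehp_csb xehp_csb_decode(uint64_t csb)
{
	struct xehp_csb d = {
		.switch_detail   = (uint8_t)bits(csb, 3, 0),
		.engine_instance = (uint8_t)bits(csb, 9, 4),
		.to_ctx_id       = (uint16_t)bits(csb, 25, 10),
		.to_counter      = (uint8_t)bits(csb, 31, 26),
		.sema_wait_mode  = bits(csb, 32, 32) != 0,
		.new_queue       = bits(csb, 33, 33) != 0,
		.wait_detail     = (uint8_t)bits(csb, 41, 34),
		.away_ctx_id     = (uint16_t)bits(csb, 57, 42),
		.away_counter    = (uint8_t)bits(csb, 63, 58),
	};

	return d;
}
```

The "switched to new queue" flag ends up at bit 33 of the full entry, which is bit 1 of the upper dword, matching the BIT(1) "upper csb dword" define in the patch.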
[PATCH 40/53] drm/i915/dg2: Don't read DRAM info
DG2 does not use system DRAM information for BW_BUDDY programming or
watermark workarounds, so there's no need to read this out at startup.

Cc: Anusha Srivatsa
Signed-off-by: Matt Roper
---
 drivers/gpu/drm/i915/intel_dram.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_dram.c b/drivers/gpu/drm/i915/intel_dram.c
index 879b0f007be3..9675bb94b70b 100644
--- a/drivers/gpu/drm/i915/intel_dram.c
+++ b/drivers/gpu/drm/i915/intel_dram.c
@@ -494,15 +494,15 @@ void intel_dram_detect(struct drm_i915_private *i915)
 	struct dram_info *dram_info = &i915->dram_info;
 	int ret;
 
+	if (GRAPHICS_VER(i915) < 9 || IS_DG2(i915) || !HAS_DISPLAY(i915))
+		return;
+
 	/*
 	 * Assume level 0 watermark latency adjustment is needed until proven
 	 * otherwise, this w/a is not needed by bxt/glk.
 	 */
 	dram_info->wm_lv_0_adjust_needed = !IS_GEN9_LP(i915);
 
-	if (GRAPHICS_VER(i915) < 9 || !HAS_DISPLAY(i915))
-		return;
-
 	if (GRAPHICS_VER(i915) >= 12)
 		ret = gen12_get_dram_info(i915);
 	else if (GRAPHICS_VER(i915) >= 11)
-- 
2.25.4
[PATCH 35/53] drm/i915/dg2: Skip shared DPLL handling
DG2 has no shared DPLL's or DDI clock muxing. The Port PLL is embedded within the PHY. Bspec: 54032 Bspec: 54034 Cc: Lucas De Marchi Cc: Mohammed Khajapasha Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_display.c | 10 +++--- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 5 - 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 026c28c612f0..c673d0c8fb4a 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3474,7 +3474,8 @@ static void icl_ddi_bigjoiner_pre_enable(struct intel_atomic_state *state, * Enable sequence steps 1-7 on bigjoiner master */ intel_encoders_pre_pll_enable(state, master); - intel_enable_shared_dpll(master_crtc_state); + if (master_crtc_state->shared_dpll) + intel_enable_shared_dpll(master_crtc_state); intel_encoders_pre_enable(state, master); /* and DSC on slave */ @@ -8633,10 +8634,11 @@ intel_pipe_config_compare(const struct intel_crtc_state *current_config, PIPE_CONF_CHECK_BOOL(double_wide); - PIPE_CONF_CHECK_P(shared_dpll); + if (dev_priv->dpll.mgr) + PIPE_CONF_CHECK_P(shared_dpll); /* FIXME do the readout properly and get rid of this quirk */ - if (!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) { + if (dev_priv->dpll.mgr && !PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) { PIPE_CONF_CHECK_X(dpll_hw_state.dpll); PIPE_CONF_CHECK_X(dpll_hw_state.dpll_md); PIPE_CONF_CHECK_X(dpll_hw_state.fp0); @@ -8668,7 +8670,9 @@ intel_pipe_config_compare(const struct intel_crtc_state *current_config, PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_ssc); PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_bias); PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_tdc_coldst_bias); + } + if (!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) { PIPE_CONF_CHECK_X(dsi_pll.ctrl); PIPE_CONF_CHECK_X(dsi_pll.div); diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c 
b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c index 882bfd499e55..5688d9704636 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c @@ -4462,7 +4462,10 @@ void intel_shared_dpll_init(struct drm_device *dev) const struct dpll_info *dpll_info; int i; - if (IS_ALDERLAKE_P(dev_priv)) + if (IS_DG2(dev_priv)) + /* No shared DPLLs on DG2; port PLLs are part of the PHY */ + dpll_mgr = NULL; + else if (IS_ALDERLAKE_P(dev_priv)) dpll_mgr = &adlp_pll_mgr; else if (IS_ALDERLAKE_S(dev_priv)) dpll_mgr = &adls_pll_mgr; -- 2.25.4
[PATCH 36/53] drm/i915/dg2: Don't wait for AUX power well enable ACKs
On DG2 we're supposed to just wait 600us after programming the well before moving on; there won't be an ack from the hardware. Bspec: 49296 Signed-off-by: Matt Roper --- .../gpu/drm/i915/display/intel_display_power.c | 16 .../gpu/drm/i915/display/intel_display_power.h | 6 ++ 2 files changed, 22 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 285380079aab..c34ff0947b85 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -341,6 +341,17 @@ static void hsw_wait_for_power_well_enable(struct drm_i915_private *dev_priv, { const struct i915_power_well_regs *regs = power_well->desc->hsw.regs; int pw_idx = power_well->desc->hsw.idx; + int enable_delay = power_well->desc->hsw.fixed_enable_delay; + + /* +* For some power wells we're not supposed to watch the status bit for +* an ack, but rather just wait a fixed amount of time and then +* proceed. This is only used on DG2. +*/ + if (IS_DG2(dev_priv) && enable_delay) { + usleep_range(enable_delay, 2 * enable_delay); + return; + } /* Timeout for PW1:10 us, AUX:not specified, other PWs:20 us. 
*/ if (intel_de_wait_for_set(dev_priv, regs->driver, @@ -4828,6 +4839,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = ICL_PW_CTL_IDX_AUX_A, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4838,6 +4850,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = ICL_PW_CTL_IDX_AUX_B, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4848,6 +4861,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = ICL_PW_CTL_IDX_AUX_C, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4858,6 +4872,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = XELPD_PW_CTL_IDX_AUX_D, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4878,6 +4893,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = TGL_PW_CTL_IDX_AUX_TC1, + .hsw.fixed_enable_delay = 600, }, }, { diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h index 4f0917df4375..22367b5cba96 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.h +++ b/drivers/gpu/drm/i915/display/intel_display_power.h @@ -223,6 +223,12 @@ struct i915_power_well_desc { u8 idx; /* Mask of pipes whose IRQ logic is backed by the pw */ u8 irq_pipe_mask; + /* +* Instead of waiting for the status bit to ack enables, +* just wait a specific amount of time and then consider +* the well enabled. +*/ + u16 fixed_enable_delay; /* The pw is backing the VGA functionality */ bool has_vga:1; bool has_fuses:1; -- 2.25.4
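The "fixed delay instead of ack" behaviour this patch adds can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: struct fake_hw, hw_read_ack(), hw_delay_us() and wait_for_well_enable() are all hypothetical, the hardware is a toy that raises its ack after a given number of status reads, the "delay" only accumulates a counter instead of sleeping, and the IS_DG2() part of the condition is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy hardware model: the ack bit appears after a number of status reads. */
struct fake_hw {
	unsigned int polls_until_ack;	/* status reads before the ack shows up */
	unsigned int slept_us;		/* total time the caller has waited */
};

static bool hw_read_ack(struct fake_hw *hw)
{
	if (hw->polls_until_ack == 0)
		return true;
	hw->polls_until_ack--;
	return false;
}

static void hw_delay_us(struct fake_hw *hw, unsigned int us)
{
	hw->slept_us += us;	/* a real driver would sleep here */
}

enum wait_result { WAIT_FIXED_DELAY, WAIT_ACKED, WAIT_TIMED_OUT };

/*
 * If the well has a fixed enable delay, wait that long and never look at
 * the status bit. Otherwise poll for the ack in 1 us steps until timeout.
 */
static enum wait_result wait_for_well_enable(struct fake_hw *hw,
					     unsigned int fixed_enable_delay_us,
					     unsigned int timeout_us)
{
	unsigned int waited = 0;

	if (fixed_enable_delay_us) {
		hw_delay_us(hw, fixed_enable_delay_us);
		return WAIT_FIXED_DELAY;
	}

	while (!hw_read_ack(hw)) {
		if (waited >= timeout_us)
			return WAIT_TIMED_OUT;
		hw_delay_us(hw, 1);
		waited++;
	}
	return WAIT_ACKED;
}
```

With a fixed delay of 600 (the value the patch sets for the AUX wells) the status bit is never read, which is the point of the change: on DG2 there is no ack to wait for.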
[PATCH 49/53] drm/i915/dg2: Add DG2 to the PSR2 defeature list
From: José Roberto de Souza

PSR2 is not supported on DG2.

Cc: Caz Yokoyama
Cc: Gwan-gyeong Mun
Signed-off-by: José Roberto de Souza
Signed-off-by: Matt Roper
---
 drivers/gpu/drm/i915/display/intel_psr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c
index 4ba5337064ea..422e48927b5b 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -866,7 +866,8 @@ static bool intel_psr2_config_valid(struct intel_dp *intel_dp,
 	}
 
 	/* Wa_16011181250 */
-	if (IS_ROCKETLAKE(dev_priv) || IS_ALDERLAKE_S(dev_priv)) {
+	if (IS_ROCKETLAKE(dev_priv) || IS_ALDERLAKE_S(dev_priv) ||
+	    IS_DG2(dev_priv)) {
 		drm_dbg_kms(&dev_priv->drm, "PSR2 is defeatured for this platform\n");
 		return false;
 	}
-- 
2.25.4
[PATCH 48/53] drm/i915/dg2: Update lane disable power state during PSR
From: Gwan-gyeong Mun The PSR enable/disable sequences now require that we program an extra register in the PHY to adjust the lane disable power setting. Bspec: 49274 Bspec: 53885 Cc: Anusha Srivatsa Signed-off-by: Matt Roper Signed-off-by: Gwan-gyeong Mun --- drivers/gpu/drm/i915/display/intel_psr.c | 7 +++ drivers/gpu/drm/i915/display/intel_snps_phy.c | 14 ++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 3 +++ drivers/gpu/drm/i915/i915_reg.h | 3 +++ 4 files changed, 27 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 9643624fe160..4ba5337064ea 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -32,6 +32,7 @@ #include "intel_dp_aux.h" #include "intel_hdmi.h" #include "intel_psr.h" +#include "intel_snps_phy.h" #include "intel_sprite.h" #include "skl_universal_plane.h" @@ -1206,6 +1207,7 @@ static void intel_psr_enable_locked(struct intel_dp *intel_dp, { struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp); struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); + enum phy phy = intel_port_to_phy(dev_priv, dig_port->base.port); struct intel_encoder *encoder = &dig_port->base; u32 val; @@ -1231,6 +1233,7 @@ static void intel_psr_enable_locked(struct intel_dp *intel_dp, intel_dp_compute_psr_vsc_sdp(intel_dp, crtc_state, conn_state, &intel_dp->psr.vsc); intel_write_dp_vsc_sdp(encoder, crtc_state, &intel_dp->psr.vsc); + intel_snps_phy_update_psr_power_state(dev_priv, phy, true); intel_psr_enable_sink(intel_dp); intel_psr_enable_source(intel_dp); intel_dp->psr.enabled = true; @@ -1327,6 +1330,8 @@ static void intel_psr_wait_exit_locked(struct intel_dp *intel_dp) static void intel_psr_disable_locked(struct intel_dp *intel_dp) { struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); + enum phy phy = intel_port_to_phy(dev_priv, +dp_to_dig_port(intel_dp)->base.port); lockdep_assert_held(&intel_dp->psr.lock); @@ -1353,6 +1358,8 @@ static 
void intel_psr_disable_locked(struct intel_dp *intel_dp) TRANS_SET_CONTEXT_LATENCY(intel_dp->psr.transcoder), TRANS_SET_CONTEXT_LATENCY_MASK, 0); + intel_snps_phy_update_psr_power_state(dev_priv, phy, false); + /* Disable PSR on Sink */ drm_dp_dpcd_writeb(&intel_dp->aux, DP_PSR_EN_CFG, 0); diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index f0c30d3d2dfb..18b52b64af95 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -36,6 +36,20 @@ void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv) } } +void intel_snps_phy_update_psr_power_state(struct drm_i915_private *dev_priv, + enum phy phy, bool enable) +{ + u32 val; + + if (!intel_phy_is_snps(dev_priv, phy)) + return; + + val = REG_FIELD_PREP(SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR, +enable ? 2 : 3); + intel_uncore_rmw(&dev_priv->uncore, SNPS_PHY_TX_REQ(phy), +SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR, val); +} + static const u32 dg2_ddi_translations[] = { /* VS 0, pre-emph 0 */ REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26), diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index 6aa33ff729ec..6261ff88ef5c 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -12,8 +12,11 @@ struct drm_i915_private; struct intel_encoder; struct intel_crtc_state; struct intel_mpllb_state; +enum phy; void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv); +void intel_snps_phy_update_psr_power_state(struct drm_i915_private *dev_priv, + enum phy phy, bool enable); int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, struct intel_encoder *encoder); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index e3a165eb4fb6..9c5426aeddff 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2340,6 +2340,9 @@ static 
inline bool i915_mmio_reg_valid(i915_reg_t reg) #define SNPS_PHY_REF_CONTROL(phy) _MMIO_SNPS(phy, 0x168188) #define SNPS_PHY_REF_CONTROL_REF_RANGE REG_GENMASK(31, 27) +#define SNPS_PHY_TX_REQ(phy) _MMIO_SNPS(phy, 0x168200) +#define SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR REG_GENMASK(31, 30) + #define SNPS_PHY_TX_EQ(ln, phy)_MMIO_SNPS_LN(ln, phy, 0x168300
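The register update in intel_snps_phy_update_psr_power_state() boils down to "prepare a 2-bit field in bits 31:30, then read-modify-write". A small userspace sketch of that arithmetic follows; genmask32(), field_prep32(), rmw32() and psr_lane_pwr_state() are hypothetical helpers written for illustration, loosely modelled on what GENMASK/FIELD_PREP-style macros and a read-modify-write accessor do, not the kernel implementations.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit mask with bits hi..lo (inclusive) set. */
static uint32_t genmask32(unsigned int hi, unsigned int lo)
{
	assert(hi < 32 && lo <= hi);
	return (~0u >> (31 - hi)) & (~0u << lo);
}

/* Shift val up to the lowest set bit of mask and clip it to the field. */
static uint32_t field_prep32(uint32_t mask, uint32_t val)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!((mask >> shift) & 1u))
		shift++;
	return (val << shift) & mask;
}

/* Read-modify-write: clear the bits in 'clear', then OR in 'set'. */
static uint32_t rmw32(uint32_t old, uint32_t clear, uint32_t set)
{
	return (old & ~clear) | set;
}

/* New register value for the lane-disable power state: 2 on enable, 3 on disable. */
static uint32_t psr_lane_pwr_state(uint32_t reg, int enable)
{
	uint32_t mask = genmask32(31, 30);

	return rmw32(reg, mask, field_prep32(mask, enable ? 2u : 3u));
}
```

Only the top two bits change; for instance a register holding 0x12345678 becomes 0x92345678 when PSR is enabled and 0xD2345678 when it is disabled.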
[PATCH 53/53] drm/i915/dg2: Configure PCON in DP pre-enable path
From: Ankit Nautiyal

Add the functions to configure HDMI2.1 pcon for DG2, before DP link
training.

Signed-off-by: Ankit Nautiyal
Signed-off-by: Matt Roper
---
 drivers/gpu/drm/i915/display/intel_ddi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 5499a2975a0e..77f79f3269a1 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -2580,6 +2580,7 @@ static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state,
 	if (!is_mst)
 		intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
 
+	intel_dp_configure_protocol_converter(intel_dp, crtc_state);
 	intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true);
 	/*
 	 * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit
@@ -2587,6 +2588,8 @@ static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state,
 	 * training
 	 */
 	intel_dp_sink_set_fec_ready(intel_dp, crtc_state);
+	intel_dp_check_frl_training(intel_dp);
+	intel_dp_pcon_dsc_configure(intel_dp, crtc_state);
 
 	/*
 	 * 5.h Follow DisplayPort specification training sequence (see notes for
-- 
2.25.4
[PATCH 47/53] drm/i915/dg2: Wait for SNPS PHY calibration during display init
Initialization of the PHY is handled by the hardware/firmware, but the driver should wait up to 25ms for the PHY to report that its calibration has completed. Bspec: 49189 Bspec: 50107 Cc: Matt Atwood Signed-off-by: Matt Roper --- .../gpu/drm/i915/display/intel_display_power.c| 5 + drivers/gpu/drm/i915/display/intel_snps_phy.c | 15 +++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 3 +++ drivers/gpu/drm/i915/i915_reg.h | 1 + 4 files changed, 24 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index df6358638fee..83bc2e691560 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -18,6 +18,7 @@ #include "intel_pm.h" #include "intel_pps.h" #include "intel_sideband.h" +#include "intel_snps_phy.h" #include "intel_tc.h" #include "intel_vga.h" @@ -5899,6 +5900,10 @@ static void icl_display_core_init(struct drm_i915_private *dev_priv, if (DISPLAY_VER(dev_priv) >= 12) tgl_bw_buddy_init(dev_priv); + /* 8. Ensure PHYs have completed calibration and adaptation */ + if (IS_DG2(dev_priv)) + intel_snps_phy_wait_for_calibration(dev_priv); + if (resume && intel_dmc_has_payload(dev_priv)) intel_dmc_load_program(dev_priv); diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index 77759bda98a4..f0c30d3d2dfb 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -21,6 +21,21 @@ * since it is not handled by the shared DPLL framework as on other platforms. 
*/ +void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv) +{ + enum phy phy; + + for_each_phy_masked(phy, ~0) { + if (!intel_phy_is_snps(dev_priv, phy)) + continue; + + if (intel_de_wait_for_clear(dev_priv, ICL_PHY_MISC(phy), + DG2_PHY_DP_TX_ACK_MASK, 25)) + DRM_ERROR("SNPS PHY %c failed to calibrate after 25ms.\n", + phy); + } +} + static const u32 dg2_ddi_translations[] = { /* VS 0, pre-emph 0 */ REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26), diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index 3ce92d424f66..6aa33ff729ec 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -8,10 +8,13 @@ #include +struct drm_i915_private; struct intel_encoder; struct intel_crtc_state; struct intel_mpllb_state; +void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv); + int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, struct intel_encoder *encoder); void intel_mpllb_enable(struct intel_encoder *encoder, diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 203056b9f02c..e3a165eb4fb6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -12442,6 +12442,7 @@ enum skl_power_gate { _ICL_PHY_MISC_B) #define ICL_PHY_MISC_MUX_DDID (1 << 28) #define ICL_PHY_MISC_DE_IO_COMP_PWR_DOWN (1 << 23) +#define DG2_PHY_DP_TX_ACK_MASK REG_GENMASK(23, 20) /* Icelake Display Stream Compression Registers */ #define DSCA_PICTURE_PARAMETER_SET_0 _MMIO(0x6B200) -- 2.25.4
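Note on the calibration wait: the patch polls ICL_PHY_MISC until the bits in DG2_PHY_DP_TX_ACK_MASK (bits 23:20) read back as zero, giving up after 25 ms. The sketch below shows the same poll-until-clear pattern in plain C. It is an illustration only: the fake register, the one-read-per-millisecond time model and every name in it are assumptions, not how intel_de_wait_for_clear() is implemented.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bits 23:20, i.e. the value REG_GENMASK(23, 20) describes. */
#define TX_ACK_MASK 0x00F00000u

/* Hypothetical fake register: the ack bits clear after a number of reads. */
struct fake_phy {
	uint32_t value;
	int reads_until_clear;	/* 0 means "never clears" */
};

uint32_t fake_phy_read(struct fake_phy *phy)
{
	if (phy->reads_until_clear > 0 && --phy->reads_until_clear == 0)
		phy->value &= ~TX_ACK_MASK;
	return phy->value;
}

/*
 * Poll once per simulated millisecond until every bit in mask reads back
 * as zero. Returns 0 on success and -ETIMEDOUT once the budget is spent.
 */
int wait_for_clear(struct fake_phy *phy, uint32_t mask, int timeout_ms)
{
	int elapsed;

	for (elapsed = 0; elapsed <= timeout_ms; elapsed++) {
		if ((fake_phy_read(phy) & mask) == 0)
			return 0;
	}
	return -ETIMEDOUT;
}
```

As in the patch, a timeout is reported to the caller but does not abort the loop over the remaining PHYs; that policy belongs to the caller, not to the wait helper.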
[PATCH 04/53] drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based
From: Tvrtko Ursulin On Xe_HP the fusing register is renamed and changed to have "enable" semantics, but otherwise remains compatible (mmio address, bitmask ranges) with older platforms. To simplify things we do not add a new register definition but just stop inverting the fusing masks before processing them. Bspec: 33288 Cc: Daniele Ceraolo Spurio Signed-off-by: Tvrtko Ursulin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 88694822716a..151870d8fdd3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -468,7 +468,14 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) if (GRAPHICS_VER(i915) < 11) return info->engine_mask; - media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE); + /* +* On newer platforms the fusing register is called 'enable' and has +* enable semantics, while on older platforms it is called 'disable' +* and bits have disable semantics. +*/ + media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE); + if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) + media_fuse = ~media_fuse; vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK; vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >> -- 2.25.4
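Note on the fuse semantics: the change boils down to normalising the raw register value so that a set bit always means "engine present", whichever polarity the hardware uses. A small stand-alone sketch, with IP_VER() redefined locally and the function name chosen only for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the version encoding used in the patch: major << 8 | minor. */
#define IP_VER(ver, rel) ((ver) << 8 | (rel))

/* Return a mask in which a set bit always means "engine present". */
uint32_t media_fuse_enabled_mask(uint32_t raw, int graphics_ver_full)
{
	if (graphics_ver_full < IP_VER(12, 50))
		return ~raw;	/* older parts: a set bit means "disabled" */
	return raw;		/* Xe_HP and later: a set bit means "enabled" */
}
```

Everything downstream of this point (the vdbox and vebox mask extraction) can then stay identical for both register flavours, which is why the patch does not need a second register definition.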
[PATCH 46/53] drm/i915/dg2: Classify DG2 PHY types
Although the bspec labels four of DG2's outputs as "combo PHY," the underlying PHYs in both cases are actually Synopsys PHYs that are programmed completely differently than the traditional Intel "combo" PHY units. As such, we don't want intel_phy_is_combo to take us down legacy programming paths, so just return false from it on DG2. Instead add a new intel_phy_is_snps() that will return true for all DG2 PHYs. Cc: Anusha Srivatsa Cc: Matt Atwood Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_display.c | 26 +++- drivers/gpu/drm/i915/display/intel_display.h | 1 + 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index cce520b6dfcf..9655f1b1b41b 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3698,6 +3698,13 @@ bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy) { if (phy == PHY_NONE) return false; + else if (IS_DG2(dev_priv)) + /* +* DG2 outputs labelled as "combo PHY" in the bspec use +* SNPS PHYs with completely different programming, +* hence we always return false here. 
+*/ + return false; else if (IS_ALDERLAKE_S(dev_priv)) return phy <= PHY_E; else if (IS_DG1(dev_priv) || IS_ROCKETLAKE(dev_priv)) @@ -3712,7 +3719,10 @@ bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy) bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy) { - if (IS_ALDERLAKE_P(dev_priv)) + if (IS_DG2(dev_priv)) + /* DG2's "TC1" output uses a SNPS PHY */ + return false; + else if (IS_ALDERLAKE_P(dev_priv)) return phy >= PHY_F && phy <= PHY_I; else if (IS_TIGERLAKE(dev_priv)) return phy >= PHY_D && phy <= PHY_I; @@ -3722,6 +3732,20 @@ bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy) return false; } +bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy) +{ + if (phy == PHY_NONE) + return false; + else if (IS_DG2(dev_priv)) + /* +* All four "combo" ports and the TC1 port (PHY E) use +* Synopsys PHYs. +*/ + return phy <= PHY_E; + + return false; +} + enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port) { if (DISPLAY_VER(i915) >= 13 && port >= PORT_D_XELPD) diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index c9dbaf074d77..284936f0ddab 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h @@ -561,6 +561,7 @@ struct drm_display_mode * intel_encoder_current_mode(struct intel_encoder *encoder); bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy); bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy); +bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy); enum tc_port intel_port_to_tc(struct drm_i915_private *dev_priv, enum port port); int intel_get_pipe_from_crtc_id_ioctl(struct drm_device *dev, void *data, -- 2.25.4
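Note on the classification: after this patch the predicates partition DG2's PHYs so that PHY A through E are "snps" and nothing on DG2 is "combo" or "tc". The sketch below restates that in stand-alone C. The platform enum is a hypothetical replacement for IS_DG2() and IS_ALDERLAKE_S(), and only the two platforms whose ranges are visible in the diff are modelled.

```c
#include <assert.h>
#include <stdbool.h>

enum phy { PHY_NONE = -1, PHY_A, PHY_B, PHY_C, PHY_D, PHY_E, PHY_F };

/* Hypothetical platform tag; the driver tests IS_DG2() and friends instead. */
enum platform { PLATFORM_ADL_S, PLATFORM_DG2 };

bool phy_is_snps(enum platform p, enum phy phy)
{
	if (phy == PHY_NONE)
		return false;
	if (p == PLATFORM_DG2)
		return phy <= PHY_E;	/* four "combo" outputs plus TC1 */
	return false;
}

bool phy_is_combo(enum platform p, enum phy phy)
{
	if (phy == PHY_NONE)
		return false;
	if (p == PLATFORM_DG2)
		return false;		/* keep DG2 off the legacy combo paths */
	return phy <= PHY_E;		/* ADL-S range as shown in the diff */
}
```

The point of returning false from the combo check, rather than adding DG2 exceptions at each call site, is that existing callers fall through to "not combo, not TC" without further changes.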
[PATCH 43/53] drm/i915/dg2: Add MPLLB programming for HDMI
At the moment we don't have a proper algorithm that can be used to calculate PHY settings for arbitrary HDMI link rates. The PHY tables here should support the regular modes of real-world HDMI monitors. Bspec: 54032 Cc: Matt Atwood Signed-off-by: Matt Roper Signed-off-by: Vandita Kulkarni --- drivers/gpu/drm/i915/display/intel_ddi.c | 14 +- drivers/gpu/drm/i915/display/intel_display.c | 47 +++ drivers/gpu/drm/i915/display/intel_hdmi.c | 11 + drivers/gpu/drm/i915/display/intel_snps_phy.c | 286 +- drivers/gpu/drm/i915/display/intel_snps_phy.h | 7 + drivers/gpu/drm/i915/i915_reg.h | 3 + 6 files changed, 355 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 26a3aa73fcc4..929a95ddb316 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -51,6 +51,7 @@ #include "intel_panel.h" #include "intel_pps.h" #include "intel_psr.h" +#include "intel_snps_phy.h" #include "intel_sprite.h" #include "intel_tc.h" #include "intel_vdsc.h" @@ -3745,6 +3746,15 @@ void intel_ddi_get_clock(struct intel_encoder *encoder, &crtc_state->dpll_hw_state); } +static void dg2_ddi_get_config(struct intel_encoder *encoder, + struct intel_crtc_state *crtc_state) +{ + intel_mpllb_readout_hw_state(encoder, &crtc_state->mpllb_state); + crtc_state->port_clock = intel_mpllb_calc_port_clock(encoder, &crtc_state->mpllb_state); + + intel_ddi_get_config(encoder, crtc_state); +} + static void adls_ddi_get_config(struct intel_encoder *encoder, struct intel_crtc_state *crtc_state) { @@ -4606,7 +4616,9 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder->cloneable = 0; encoder->pipe_mask = ~0; - if (IS_ALDERLAKE_S(dev_priv)) { + if (IS_DG2(dev_priv)) { + encoder->get_config = dg2_ddi_get_config; + } else if (IS_ALDERLAKE_S(dev_priv)) { encoder->enable_clock = adls_ddi_enable_clock; encoder->disable_clock = adls_ddi_disable_clock; encoder->is_clock_enabled 
= adls_ddi_is_clock_enabled; diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 91f6964ec406..cce520b6dfcf 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9113,6 +9113,52 @@ verify_shared_dpll_state(struct intel_crtc *crtc, } } +static void +verify_mpllb_state(struct intel_atomic_state *state, + struct intel_crtc_state *new_crtc_state) +{ + struct drm_i915_private *i915 = to_i915(state->base.dev); + struct intel_mpllb_state mpllb_hw_state = { 0 }; + struct intel_mpllb_state *mpllb_sw_state = &new_crtc_state->mpllb_state; + struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->uapi.crtc); + struct intel_encoder *encoder; + + if (!IS_DG2(i915)) + return; + + if (!new_crtc_state->hw.active) + return; + + encoder = intel_get_crtc_new_encoder(state, new_crtc_state); + intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state); + +#define MPLLB_CHECK(name) do { \ + if (mpllb_sw_state->name != mpllb_hw_state.name) { \ + pipe_config_mismatch(false, crtc, "MPLLB:" __stringify(name), \ +"(expected 0x%08x, found 0x%08x)", \ +mpllb_sw_state->name, \ +mpllb_hw_state.name); \ + } \ +} while (0) + + MPLLB_CHECK(mpllb_cp); + MPLLB_CHECK(mpllb_div); + MPLLB_CHECK(mpllb_div2); + MPLLB_CHECK(mpllb_fracn1); + MPLLB_CHECK(mpllb_fracn2); + MPLLB_CHECK(mpllb_sscen); + MPLLB_CHECK(mpllb_sscstep); + + /* +* ref_control is handled by the hardware/firmware and never +* programmed by the software, but the proper values are supplied +* in the bspec for verification purposes. 
+*/ + MPLLB_CHECK(ref_control); + +#undef MPLLB_CHECK +} + static void intel_modeset_verify_crtc(struct intel_crtc *crtc, struct intel_atomic_state *state, @@ -9126,6 +9172,7 @@ intel_modeset_verify_crtc(struct intel_crtc *crtc, verify_connector_state(state, crtc); verify_crtc_state(crtc, old_crtc_state, new_crtc_state); verify_shared_dpll_state(crtc, old_crtc_state, new_crtc_state); + verify_mpllb_state(state, new_crtc_state); } static void diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c b/drivers/gpu/drm/i915/display/intel_hdmi.c index 852af2b23540..b04685bb6439 100644 --- a/drivers/gpu/drm/i915/display/intel_hdmi.c +++ b/drivers/gpu/drm/i915/display/intel_hdmi.c @@ -51,6 +51,7 @@ #inc
[PATCH 29/53] drm/i915/dg2: Add new LRI reg offsets
From: Akeem G Abodunrin New LRI register offsets were introduced for DG2, this patch adds those extra registers, and create new register table for setting offsets to compare with HW generated context image - especially for gt_lrc test. Also updates general purpose register with scratch offset for DG2, in order to use it for live_lrc_fixed selftest. Cc: Chris P Wilson Cc: Prathap Kumar Valsan Signed-off-by: Akeem G Abodunrin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_lrc.c | 85 - 1 file changed, 83 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index fee735e2a524..da7ac1d970af 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = { END }; +static const u8 dg2_xcs_offsets[] = { + NOP(1), + LRI(15, POSTED), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x1c0), + REG(0x1c4), + REG(0x1c8), + REG(0x180), + REG16(0x2b4), + REG(0x120), + REG(0x124), + + NOP(1), + LRI(9, POSTED), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + END +}; + static const u8 gen8_rcs_offsets[] = { NOP(1), LRI(14, POSTED), @@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = { END }; +static const u8 dg2_rcs_offsets[] = { + NOP(1), + LRI(15, POSTED), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x1c0), + REG(0x1c4), + REG(0x1c8), + REG(0x180), + REG16(0x2b4), + REG(0x120), + REG(0x124), + + NOP(1), + LRI(9, POSTED), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + LRI(3, POSTED), + REG(0x1b0), + REG16(0x5a8), + REG16(0x5ac), + + NOP(6), + LRI(1, 0), + REG(0x0c8), + + END 
+}; + #undef END #undef REG16 #undef REG @@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) !intel_engine_has_relative_mmio(engine)); if (engine->class == RENDER_CLASS) { - if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) + return dg2_rcs_offsets; + else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) return xehp_rcs_offsets; else if (GRAPHICS_VER(engine->i915) >= 12) return gen12_rcs_offsets; @@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) else return gen8_rcs_offsets; } else { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) + return dg2_xcs_offsets; + else if (GRAPHICS_VER(engine->i915) >= 12) return gen12_xcs_offsets; else if (GRAPHICS_VER(engine->i915) >= 9) return gen9_xcs_offsets; -- 2.25.4
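Note on the table selection: the new IP_VER(12, 55) checks have to come before the existing 12.50 and 12 checks, because every ladder entry is a ">=" comparison and the first match wins. A stand-alone sketch of the ladder, reduced to the entries touched by this patch; the enum and function name are made up for the example, and everything older than graphics version 12 is lumped into one fallback value.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the version encoding used in the patch: major << 8 | minor. */
#define IP_VER(ver, rel) ((ver) << 8 | (rel))

/* Hypothetical identifiers for the offset tables named in the diff. */
enum lri_table {
	TABLE_OLDER,
	TABLE_GEN12_RCS,
	TABLE_XEHP_RCS,
	TABLE_DG2_RCS,
	TABLE_GEN12_XCS,
	TABLE_DG2_XCS,
};

/* Most specific version first: each test is ">=", so order matters. */
enum lri_table pick_lri_table(int ver_full, bool is_render)
{
	if (is_render) {
		if (ver_full >= IP_VER(12, 55))
			return TABLE_DG2_RCS;
		if (ver_full >= IP_VER(12, 50))
			return TABLE_XEHP_RCS;
		if (ver_full >= IP_VER(12, 0))
			return TABLE_GEN12_RCS;
	} else {
		if (ver_full >= IP_VER(12, 55))
			return TABLE_DG2_XCS;
		if (ver_full >= IP_VER(12, 0))
			return TABLE_GEN12_XCS;
	}
	return TABLE_OLDER;
}
```

Note that for non-render engines a 12.50 part still lands on the gen12 table; only 12.55 and later pick up the DG2 layout.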
[PATCH 44/53] drm/i915/dg2: Add vswing programming for SNPS phys
Vswing programming for SNPS PHYs is just a single step -- look up the value that corresponds to the voltage level from a table and program it into the SNPS_PHY_TX_EQ register. Bspec: 53920 Cc: Matt Atwood Signed-off-by: Matt Roper Signed-off-by: Jani Nikula --- drivers/gpu/drm/i915/display/intel_ddi.c | 23 ++-- drivers/gpu/drm/i915/display/intel_snps_phy.c | 54 +++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 4 ++ drivers/gpu/drm/i915/i915_reg.h | 5 ++ 4 files changed, 83 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 929a95ddb316..ade03cf41caa 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -1496,6 +1496,16 @@ static int intel_ddi_dp_level(struct intel_dp *intel_dp) return translate_signal_level(intel_dp, signal_levels); } +static void +dg2_set_signal_levels(struct intel_dp *intel_dp, + const struct intel_crtc_state *crtc_state) +{ + struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base; + int level = intel_ddi_dp_level(intel_dp); + + intel_snps_phy_ddi_vswing_sequence(encoder, level); +} + static void tgl_set_signal_levels(struct intel_dp *intel_dp, const struct intel_crtc_state *crtc_state) @@ -2563,7 +2573,10 @@ static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, */ /* 7.e Configure voltage swing and related IO settings */ - tgl_ddi_vswing_sequence(encoder, crtc_state, level); + if (IS_DG2(dev_priv)) + intel_snps_phy_ddi_vswing_sequence(encoder, level); + else + tgl_ddi_vswing_sequence(encoder, crtc_state, level); /* * 7.f Combo PHY: Configure PORT_CL_DW10 Static Power Down to power up @@ -3102,7 +3115,9 @@ static void intel_enable_ddi_hdmi(struct intel_atomic_state *state, "[CONNECTOR:%d:%s] Failed to configure sink scrambling/TMDS bit clock ratio\n", connector->base.id, connector->name); - if (DISPLAY_VER(dev_priv) >= 12) + if (IS_DG2(dev_priv)) + intel_snps_phy_ddi_vswing_sequence(encoder, 
U32_MAX); + else if (DISPLAY_VER(dev_priv) >= 12) tgl_ddi_vswing_sequence(encoder, crtc_state, level); else if (DISPLAY_VER(dev_priv) == 11) icl_ddi_vswing_sequence(encoder, crtc_state, level); @@ -4075,7 +4090,9 @@ intel_ddi_init_dp_connector(struct intel_digital_port *dig_port) dig_port->dp.set_link_train = intel_ddi_set_link_train; dig_port->dp.set_idle_link_train = intel_ddi_set_idle_link_train; - if (DISPLAY_VER(dev_priv) >= 12) + if (IS_DG2(dev_priv)) + dig_port->dp.set_signal_levels = dg2_set_signal_levels; + else if (DISPLAY_VER(dev_priv) >= 12) dig_port->dp.set_signal_levels = tgl_set_signal_levels; else if (DISPLAY_VER(dev_priv) >= 11) dig_port->dp.set_signal_levels = icl_set_signal_levels; diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index 1317b4e94b50..77759bda98a4 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -21,6 +21,60 @@ * since it is not handled by the shared DPLL framework as on other platforms. 
*/ +static const u32 dg2_ddi_translations[] = { + /* VS 0, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26), + + /* VS 0, pre-emph 1 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 33) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 6), + + /* VS 0, pre-emph 2 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 38) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 12), + + /* VS 0, pre-emph 3 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 43) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 19), + + /* VS 1, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 39), + + /* VS 1, pre-emph 1 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 44) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 8), + + /* VS 1, pre-emph 2 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 47) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 15), + + /* VS 2, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 52), + + /* VS 2, pre-emph 1 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 51) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 10), + + /* VS 3, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 62), +}; + +void intel_snps_phy_ddi_vswing_sequence(struct intel_encoder *encoder, + u32 level) +{ + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + int n_entri
[PATCH 45/53] drm/i915/dg2: Update modeset sequences
DG2 has some changes to the expected modesetting sequences when compared to gen12. Adjust our driver logic accordingly. Although the DP sequence is pretty similar to TGL's, there are some steps that change, so let's split the handling for that out into a separate function. Bspec: 54128 Cc: Lucas De Marchi Cc: Anusha Srivatsa Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_ddi.c | 135 +-- 1 file changed, 127 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index ade03cf41caa..5499a2975a0e 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -172,14 +172,22 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv, static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv, enum port port) { + int ret; + /* Wait > 518 usecs for DDI_BUF_CTL to be non idle */ if (DISPLAY_VER(dev_priv) < 10) { usleep_range(518, 1000); return; } - if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & - DDI_BUF_IS_IDLE), 500)) + if (IS_DG2(dev_priv)) + ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + DDI_BUF_IS_IDLE), 1200); + else + ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + DDI_BUF_IS_IDLE), 500); + + if (ret) drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get active\n", port_name(port)); } @@ -2207,7 +2215,7 @@ void intel_ddi_sanitize_encoder_pll_mapping(struct intel_encoder *encoder) ddi_clk_needed = false; } - if (ddi_clk_needed || !encoder->disable_clock || + if (ddi_clk_needed || !encoder->is_clock_enabled || !encoder->is_clock_enabled(encoder)) return; @@ -2488,6 +2496,116 @@ static void intel_ddi_mso_configure(const struct intel_crtc_state *crtc_state) OVERLAP_PIXELS_MASK, dss1); } +static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state, + struct intel_encoder *encoder, + const struct intel_crtc_state *crtc_state, + const struct 
drm_connector_state *conn_state) +{ + struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + struct intel_digital_port *dig_port = enc_to_dig_port(encoder); + bool is_mst = intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DP_MST); + int level = intel_ddi_dp_level(intel_dp); + + intel_dp_set_link_params(intel_dp, crtc_state->port_clock, +crtc_state->lane_count); + + /* +* 1. Enable Power Wells +* +* This was handled at the beginning of intel_atomic_commit_tail(), +* before we called down into this function. +*/ + + /* 2. Enable Panel Power if PPS is required */ + intel_pps_on(intel_dp); + + /* +* 3. Enable the port PLL. +*/ + intel_ddi_enable_clock(encoder, crtc_state); + + /* 4. Enable IO power */ + if (!intel_phy_is_tc(dev_priv, phy) || + dig_port->tc_mode != TC_PORT_TBT_ALT) + dig_port->ddi_io_wakeref = intel_display_power_get(dev_priv, + dig_port->ddi_io_power_domain); + + /* +* 5. The rest of the below are substeps under the bspec's "Enable and +* Train Display Port" step. Note that steps that are specific to +* MST will be handled by intel_mst_pre_enable_dp() before/after it +* calls into this function. Also intel_mst_pre_enable_dp() only calls +* us when active_mst_links==0, so any steps designated for "single +* stream or multi-stream master transcoder" can just be performed +* unconditionally here. +*/ + + /* +* 5.a Configure Transcoder Clock Select to direct the Port clock to the +* Transcoder. +*/ + intel_ddi_enable_pipe_clock(encoder, crtc_state); + + /* 5.b Not relevant to i915 for now */ + + /* +* 5.c Configure TRANS_DDI_FUNC_CTL DDI Select, DDI Mode Select & MST +* Transport Select +*/ + intel_ddi_config_transcoder_func(encoder, crtc_state); + + /* +* 5.d Configure & enable DP_TP_CTL with link training pattern 1 +* selected +* +* This will be handled by the intel_dp_start_link_train() farther +* down this function. 
+*/ + + /* 5.e Configure voltage swing and re
[PATCH 07/53] drm/i915/xehp: Extra media engines - Part 1 (engine definitions)
From: John Harrison Xe_HP can have a lot of extra media engines. This patch adds the basic definitions for them. Cc: Tvrtko Ursulin Signed-off-by: John Harrison Signed-off-by: Tomas Winkler Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 ++- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 50 drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 -- drivers/gpu/drm/i915/i915_reg.h | 6 +++ 4 files changed, 69 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 87b06572fd2e..35edc55720f4 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) if (mode & EMIT_INVALIDATE) aux_inv = rq->engine->mask & ~BIT(BCS0); if (aux_inv) - cmd += 2 * hweight8(aux_inv) + 2; + cmd += 2 * hweight32(aux_inv) + 2; cs = intel_ring_begin(rq, cmd); if (IS_ERR(cs)) @@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) struct intel_engine_cs *engine; unsigned int tmp; - *cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv)); - for_each_engine_masked(engine, rq->engine->gt, - aux_inv, tmp) { + *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv)); + for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) { *cs++ = i915_mmio_reg_offset(aux_inv_reg(engine)); *cs++ = AUX_INV; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 4ab2c9abb943..6e2aa1acc4d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -104,6 +104,38 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE } }, }, + [VCS4] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 4, + .mmio_bases = { + { .graphics_ver = 11, .base = XEHP_BSD5_RING_BASE } + }, + }, + [VCS5] = { + 
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 5, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE } + }, + }, + [VCS6] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 6, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE } + }, + }, + [VCS7] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 7, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE } + }, + }, [VECS0] = { .hw_id = VECS0_HW, .class = VIDEO_ENHANCEMENT_CLASS, @@ -121,6 +153,22 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE } }, }, + [VECS2] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_ENHANCEMENT_CLASS, + .instance = 2, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE } + }, + }, + [VECS3] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_ENHANCEMENT_CLASS, + .instance = 3, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE } + }, + }, }; /** @@ -269,6 +317,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id) BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH)); BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH)); + BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1)); + BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1)); if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine))) return -EINVAL; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 5b91068ab277..b25f594a7e4b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_t
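Note on the hweight8() to hweight32() change in gen12_emit_flush_xcs(): an 8-bit population count silently ignores any engine whose mask bit sits above bit 7, so the ring space reserved could be smaller than what is emitted once more engines exist. The sketch below reproduces the arithmetic with a portable popcount. The helper names are stand-ins for the kernel's hweight helpers, and the mask values in the usage are arbitrary examples, not real engine masks.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count, standing in for the kernel's hweight32(). */
unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Stand-in for hweight8(): only the low 8 bits are counted. */
unsigned int popcount8(uint32_t v)
{
	return popcount32(v & 0xffu);
}

/* Dword budget as in the patch: two per engine plus two, if any are set. */
unsigned int aux_inv_dwords(uint32_t aux_inv, int use_32bit)
{
	unsigned int engines = use_32bit ? popcount32(aux_inv)
					 : popcount8(aux_inv);

	return aux_inv ? 2 * engines + 2 : 0;
}
```

With a mask of 0x300 the 8-bit count reserves 2 dwords while the 32-bit count reserves 6; for masks that fit in the low byte the two agree.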
[PATCH 32/53] drm/i915/dg2: Define MOCS table for DG2
Bspec: 45101, 45427 Cc: Ramalingam C (v5) Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_mocs.c | 35 +++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 0c9d0b936c20..d22ca8212092 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -341,6 +341,30 @@ static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = { MOCS_ENTRY(63, 0, L3_1_UC), }; +static const struct drm_i915_mocs_entry dg2_mocs_table[] = { + /* UC - Coherent; GO:L3 */ + MOCS_ENTRY(0, 0, L3_1_UC | L3_LKUP(1)), + /* UC - Coherent; GO:Memory */ + MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Non-Coherent; GO:Memory */ + MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)), + + /* WB - LC */ + MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)), +}; + +static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = { + /* Wa_14011441408: Set Go to Memory for MOCS#0 */ + MOCS_ENTRY(0, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Coherent; GO:Memory */ + MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Non-Coherent; GO:Memory */ + MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)), + + /* WB - LC */ + MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)), +}; + enum { HAS_GLOBAL_MOCS = BIT(0), HAS_ENGINE_MOCS = BIT(1), @@ -367,7 +391,16 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915, { unsigned int flags; - if (IS_XEHPSDV(i915)) { + if (IS_DG2(i915)) { + if (IS_DG2_GT_STEP(i915, G10, STEP_A0, (STEP_B0 - 1))) { + table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax); + table->table = dg2_mocs_table_g10_ax; + } else { + table->size = ARRAY_SIZE(dg2_mocs_table); + table->table = dg2_mocs_table; + } + table->n_entries = GEN9_NUM_MOCS_ENTRIES; + } else if (IS_XEHPSDV(i915)) { table->size = ARRAY_SIZE(xehpsdv_mocs_table); table->table = xehpsdv_mocs_table; table->n_entries = GEN9_NUM_MOCS_ENTRIES; -- 2.25.4
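Note on the stepping check: the only difference between the two DG2 tables is entry 0, which gains the go-to-memory bit on early G10 steppings (Wa_14011441408). The selection can be sketched as below. The stepping and variant enums are hypothetical, and the range check is assumed to be inclusive at both ends, which is what passing (STEP_B0 - 1) as the upper bound suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stepping and variant enums, ordered oldest to newest. */
enum step { STEP_A0, STEP_A1, STEP_B0, STEP_C0 };
enum dg2_variant { DG2_G10, DG2_G11 };
enum mocs_table { MOCS_DG2_DEFAULT, MOCS_DG2_G10_AX };

/* Inclusive range check (assumption, see the note above). */
bool step_in_range(enum step s, enum step first, enum step last)
{
	return s >= first && s <= last;
}

/* Early G10 steppings get the workaround table, everything else the default. */
enum mocs_table pick_dg2_mocs_table(enum dg2_variant variant, enum step s)
{
	if (variant == DG2_G10 && step_in_range(s, STEP_A0, STEP_B0 - 1))
		return MOCS_DG2_G10_AX;
	return MOCS_DG2_DEFAULT;
}
```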
[PATCH 22/53] drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
From: Lucas De Marchi Instead of maintaining the same if ladder in 3 different places, add a function to read RP_STATE_CAP. Signed-off-by: Lucas De Marchi Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/debugfs_gt_pm.c | 8 +++- drivers/gpu/drm/i915/gt/intel_rps.c | 17 - drivers/gpu/drm/i915/gt/intel_rps.h | 1 + drivers/gpu/drm/i915/i915_debugfs.c | 8 +++- 4 files changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c index 4270b5a34a83..1061a62bdfce 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c @@ -309,13 +309,11 @@ static int frequency_show(struct seq_file *m, void *unused) int max_freq; rp_state_limits = intel_uncore_read(uncore, GEN6_RP_STATE_LIMITS); - if (IS_GEN9_LP(i915)) { - rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP); + rp_state_cap = intel_rps_read_state_cap(rps); + if (IS_GEN9_LP(i915)) gt_perf_status = intel_uncore_read(uncore, BXT_GT_PERF_STATUS); - } else { - rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP); + else gt_perf_status = intel_uncore_read(uncore, GEN6_GT_PERF_STATUS); - } /* RPSTAT1 is in the GT power well */ intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL); diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 06e9a8ed4e03..490bc1513480 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -975,20 +975,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val) static void gen6_rps_init(struct intel_rps *rps) { struct drm_i915_private *i915 = rps_to_i915(rps); - struct intel_uncore *uncore = rps_to_uncore(rps); + u32 rp_state_cap = intel_rps_read_state_cap(rps); /* All of these values are in units of 50MHz */ /* static values from HW: RP0 > RP1 > RPn (min_freq) */ if (IS_GEN9_LP(i915)) { - u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP); - rps->rp0_freq = (rp_state_cap >> 16) & 0xff; 
rps->rp1_freq = (rp_state_cap >> 8) & 0xff; rps->min_freq = (rp_state_cap >> 0) & 0xff; } else { - u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP); - rps->rp0_freq = (rp_state_cap >> 0) & 0xff; rps->rp1_freq = (rp_state_cap >> 8) & 0xff; rps->min_freq = (rp_state_cap >> 16) & 0xff; @@ -1936,6 +1932,17 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps) return freq; } +u32 intel_rps_read_state_cap(struct intel_rps *rps) +{ + struct drm_i915_private *i915 = rps_to_i915(rps); + struct intel_uncore *uncore = rps_to_uncore(rps); + + if (IS_GEN9_LP(i915)) + return intel_uncore_read(uncore, BXT_RP_STATE_CAP); + else + return intel_uncore_read(uncore, GEN6_RP_STATE_CAP); +} + /* External interface for intel_ips.ko */ static struct drm_i915_private __rcu *ips_mchdev; diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h index 1d2cfc98b510..6e06dd61f818 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.h +++ b/drivers/gpu/drm/i915/gt/intel_rps.h @@ -31,6 +31,7 @@ int intel_gpu_freq(struct intel_rps *rps, int val); int intel_freq_opcode(struct intel_rps *rps, int val); u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1); u32 intel_rps_read_actual_frequency(struct intel_rps *rps); +u32 intel_rps_read_state_cap(struct intel_rps *rps); void gen5_rps_irq_handler(struct intel_rps *rps); void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir); diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index cc745751ac53..6c83da3956b9 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -420,13 +420,11 @@ static int i915_frequency_info(struct seq_file *m, void *unused) int max_freq; rp_state_limits = intel_uncore_read(&dev_priv->uncore, GEN6_RP_STATE_LIMITS); - if (IS_GEN9_LP(dev_priv)) { - rp_state_cap = intel_uncore_read(&dev_priv->uncore, BXT_RP_STATE_CAP); + rp_state_cap = intel_rps_read_state_cap(rps); + if (IS_GEN9_LP(dev_priv)) 
gt_perf_status = intel_uncore_read(&dev_priv->uncore, BXT_GT_PERF_STATUS); - } else { - rp_state_cap = intel_uncore_read(&dev_priv->uncore, GEN6_RP_STATE_CAP); + else gt_perf_status = intel_uncore_read(&dev_priv->uncore, GEN6_GT_PERF_STATUS); -
[PATCH 51/53] drm/i915/display/dsc: Set BPP in the kernel
From: Anusha Srivatsa Set compress BPP in kernel while connector DP or eDP Cc: Vandita Kulkarni Cc: Navare Manasi D Signed-off-by: Anusha Srivatsa Signed-off-by: Patnana Venkata Sai Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_dp.c | 23 ++- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 5b52beaddada..57aadee69d8b 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -1241,9 +1241,15 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp, pipe_config->lane_count = limits->max_lane_count; if (intel_dp_is_edp(intel_dp)) { - pipe_config->dsc.compressed_bpp = - min_t(u16, drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4, - pipe_config->pipe_bpp); + if (intel_dp->force_dsc_bpp) { + drm_dbg_kms(&dev_priv->drm, + "DSC BPC forced to %d", intel_dp->force_dsc_bpp); + pipe_config->dsc.compressed_bpp = intel_dp->force_dsc_bpp; + } else { + pipe_config->dsc.compressed_bpp = + min_t(u16, drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4, + pipe_config->pipe_bpp); + } pipe_config->dsc.slice_count = drm_dp_dsc_sink_max_slice_count(intel_dp->dsc_dpcd, true); @@ -1269,9 +1275,15 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp, "Compressed BPP/Slice Count not supported\n"); return -EINVAL; } - pipe_config->dsc.compressed_bpp = min_t(u16, + if (intel_dp->force_dsc_bpp) { + drm_dbg_kms(&dev_priv->drm, + "DSC BPC forced to %d\n", intel_dp->force_dsc_bpp); + pipe_config->dsc.compressed_bpp = intel_dp->force_dsc_bpp; + } else { + pipe_config->dsc.compressed_bpp = min_t(u16, dsc_max_output_bpp >> 4, pipe_config->pipe_bpp); + } pipe_config->dsc.slice_count = dsc_dp_slice_count; } /* @@ -1374,7 +1386,8 @@ intel_dp_compute_link_config(struct intel_encoder *encoder, * Pipe joiner needs compression upto display12 due to BW limitation. 
DG2 * onwards pipe joiner can be enabled without compression. */ - drm_dbg_kms(&i915->drm, "Force DSC en = %d\n", intel_dp->force_dsc_en); + drm_dbg_kms(&i915->drm, "Force DSC en = %d\n Force DSC BPP = %d\n", + intel_dp->force_dsc_en, intel_dp->force_dsc_bpp); if (ret || intel_dp->force_dsc_en || (DISPLAY_VER(i915) < 13 && pipe_config->bigjoiner)) { ret = intel_dp_dsc_compute_config(intel_dp, pipe_config, -- 2.25.4
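For readers following the two patched branches in intel_dp_dsc_compute_config(), the selection logic can be sketched outside the kernel as below. This is only an illustration: pick_compressed_bpp() is a hypothetical helper, not an i915 function, and it assumes the sink value carries four fractional bits, which is why the driver shifts it right by 4 before comparing it with pipe_bpp.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of i915): mirrors the patched logic.
 * A non-zero forced value wins; otherwise the result is the smaller of
 * the sink's output bpp (assumed to carry 4 fractional bits, hence the
 * shift) and the pipe bpp.
 */
static uint16_t pick_compressed_bpp(uint16_t force_dsc_bpp,
				    uint16_t sink_output_bpp_x16,
				    uint16_t pipe_bpp)
{
	uint16_t sink_bpp;

	if (force_dsc_bpp)
		return force_dsc_bpp;

	sink_bpp = sink_output_bpp_x16 >> 4;
	return sink_bpp < pipe_bpp ? sink_bpp : pipe_bpp;
}
```

Note that, as in the patch, a forced value is taken as-is and is not clamped against pipe_bpp or the sink capability.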
[PATCH 50/53] drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable
From: Anusha Srivatsa DSC can be supported per DP connector. This patch creates a per connector debugfs node to expose the Input and Compressed BPP. The same node can be used from userspace to force DSC to a certain BPP. force_dsc_bpp is written through this debugfs node to force DSC BPP to all accepted values Cc: Vandita Kulkarni Cc: Manasi Navare Signed-off-by: Anusha Srivatsa Signed-off-by: Patnana Venkata Sai Signed-off-by: Matt Roper --- .../drm/i915/display/intel_display_debugfs.c | 103 +- .../drm/i915/display/intel_display_types.h| 1 + 2 files changed, 103 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c b/drivers/gpu/drm/i915/display/intel_display_debugfs.c index af9e58619667..1805d70ea817 100644 --- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c +++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c @@ -2389,6 +2389,100 @@ static const struct file_operations i915_dsc_fec_support_fops = { .write = i915_dsc_fec_support_write }; +static int i915_dsc_bpp_support_show(struct seq_file *m, void *data) +{ + struct drm_connector *connector = m->private; + struct drm_device *dev = connector->dev; + struct drm_crtc *crtc; + struct intel_dp *intel_dp; + struct drm_modeset_acquire_ctx ctx; + struct intel_crtc_state *crtc_state = NULL; + int ret = 0; + bool try_again = false; + + drm_modeset_acquire_init(&ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE); + + do { + try_again = false; + ret = drm_modeset_lock(&dev->mode_config.connection_mutex, + &ctx); + if (ret) { + ret = -EINTR; + break; + } + crtc = connector->state->crtc; + if (connector->status != connector_status_connected || !crtc) { + ret = -ENODEV; + break; + } + ret = drm_modeset_lock(&crtc->mutex, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) { + try_again = true; + continue; + } + break; + } else if (ret) { + break; + } + intel_dp = intel_attached_dp(to_intel_connector(connector)); + crtc_state = 
to_intel_crtc_state(crtc->state); + seq_printf(m, "Input_BPP: %d\n", crtc_state->pipe_bpp); + seq_printf(m, "Compressed_BPP: %d\n", + crtc_state->dsc.compressed_bpp); + } while (try_again); + + drm_modeset_drop_locks(&ctx); + drm_modeset_acquire_fini(&ctx); + + return ret; +} + +static ssize_t i915_dsc_bpp_support_write(struct file *file, + const char __user *ubuf, + size_t len, loff_t *offp) +{ + int dsc_bpp = 0; + int ret; + struct drm_connector *connector = + ((struct seq_file *)file->private_data)->private; + struct intel_encoder *encoder = intel_attached_encoder(to_intel_connector(connector)); + struct drm_i915_private *i915 = to_i915(encoder->base.dev); + struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + + if (len == 0) + return 0; + + drm_dbg(&i915->drm, + "Copied %zu bytes from user to force BPP\n", len); + + ret = kstrtoint_from_user(ubuf, len, 0, &dsc_bpp); + + intel_dp->force_dsc_bpp = dsc_bpp; + if (ret < 0) + return ret; + + *offp += len; + return len; +} + +static int i915_dsc_bpp_support_open(struct inode *inode, + struct file *file) +{ + return single_open(file, i915_dsc_bpp_support_show, + inode->i_private); +} + +static const struct file_operations i915_dsc_bpp_support_fops = { + .owner = THIS_MODULE, + .open = i915_dsc_bpp_support_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, + .write = i915_dsc_bpp_support_write +}; + /** * intel_connector_debugfs_add - add i915 specific connector debugfs files * @connector: pointer to a registered drm_connector @@ -2427,9 +2521,16 @@ int intel_connector_debugfs_add(struct drm_connector *connector) connector, &i915_hdcp_sink_capability_fops); } - if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) && ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort && !to_intel_connector(connector)->mst_port) || connector->connector_type == DRM_MODE_CONNECTOR_eDP)) + if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) && +
[PATCH 20/53] drm/i915/xehpsdv: Define steering tables
Define and initialize the MMIO ranges for which XeHP SDV requires MSLICE and LNCF steering. Bspec: 66534 Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt.c | 19 ++- drivers/gpu/drm/i915/gt/intel_workarounds.c | 11 +-- 2 files changed, 27 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index f59bcedbb80b..9d1c99c9c0dd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -89,6 +89,20 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = { {}, }; +static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = { + { 0x004000, 0x004AFF }, + { 0x00C800, 0x00CFFF }, + { 0x00DD00, 0x00DDFF }, + { 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */ + {}, +}; + +static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = { + { 0x00B000, 0x00B0FF }, + { 0x00D800, 0x00D8FF }, + {}, +}; + static u16 slicemask(struct intel_gt *gt, int count) { u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0); @@ -113,7 +127,10 @@ int intel_gt_init_mmio(struct intel_gt *gt) (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN12_MEML3_EN_MASK); - if (GRAPHICS_VER(gt->i915) >= 11 && + if (IS_XEHPSDV(gt->i915)) { + gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; + gt->steering_table[LNCF] = xehpsdv_lncf_steering_table; + } else if (GRAPHICS_VER(gt->i915) >= 11 && GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50)) { gt->steering_table[L3BANK] = icl_l3bank_steering_table; gt->info.l3bank_mask = diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 060d84897635..4302dc1b728e 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -989,7 +989,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) __add_mcr_wa(i915, wal, slice, subslice); } -__maybe_unused static
void xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) { @@ -1208,10 +1207,18 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) VSUNIT_CLKGATE_DIS_TGL); } +static void +xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) +{ + xehp_init_mcr(&i915->gt, wal); +} + static void gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal) { - if (IS_DG1(i915)) + if (IS_XEHPSDV(i915)) + xehpsdv_gt_workarounds_init(i915, wal); + else if (IS_DG1(i915)) dg1_gt_workarounds_init(i915, wal); else if (IS_TIGERLAKE(i915)) tgl_gt_workarounds_init(i915, wal); -- 2.25.4
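The steering tables in this patch are arrays of inclusive MMIO ranges closed by an empty entry. A minimal sketch of how such a table might be consulted is shown below. The names struct mmio_range and range_table_contains() are hypothetical stand-ins, and the loop condition (stop at the first entry whose end is zero) is an assumption about how the sentinel is meant to be detected; the two ranges are the LNCF ones from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's range type. */
struct mmio_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/* LNCF ranges copied from the patch, closed by an all-zero sentinel. */
static const struct mmio_range lncf_table[] = {
	{ 0x00B000, 0x00B0FF },
	{ 0x00D800, 0x00D8FF },
	{ 0, 0 },
};

/* Returns true if offset falls inside any range of a sentinel-terminated table. */
static bool range_table_contains(const struct mmio_range *table, uint32_t offset)
{
	size_t i;

	if (!table)
		return false;

	for (i = 0; table[i].end > 0; i++) {
		if (offset >= table[i].start && offset <= table[i].end)
			return true;
	}

	return false;
}
```

A NULL table is treated as "no steering needed", which matches platforms that leave a steering type unpopulated.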
[PATCH 21/53] drm/i915/xehpsdv: Define MOCS table for XeHP SDV
From: Lucas De Marchi Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to memory for L3 destined transaction" and L3_LKP to "enable Lookup for uncacheable accesses". Bspec: 45101 Cc: Daniele Ceraolo Spurio Signed-off-by: Lucas De Marchi Signed-off-by: Stuart Summers Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_mocs.c | 33 +++- 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 17848807f111..0c9d0b936c20 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -40,6 +40,8 @@ struct drm_i915_mocs_table { #define L3_ESC(value) ((value) << 0) #define L3_SCC(value) ((value) << 1) #define _L3_CACHEABILITY(value)((value) << 4) +#define L3_GLBGO(value)((value) << 6) +#define L3_LKUP(value) ((value) << 7) /* Helper defines */ #define GEN9_NUM_MOCS_ENTRIES 64 /* 63-64 are reserved, but configured. */ @@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry dg1_mocs_table[] = { MOCS_ENTRY(63, 0, L3_1_UC), }; +static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = { + /* wa_1608975824 */ + MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)), + + /* UC - Coherent; GO:L3 */ + MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)), + /* UC - Coherent; GO:Memory */ + MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Non-Coherent; GO:Memory */ + MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)), + /* UC - Non-Coherent; GO:L3 */ + MOCS_ENTRY(4, 0, L3_1_UC), + + /* WB */ + MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)), + + /* HW Reserved - SW program but never use. 
*/ + MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)), + MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)), + MOCS_ENTRY(60, 0, L3_1_UC), + MOCS_ENTRY(61, 0, L3_1_UC), + MOCS_ENTRY(62, 0, L3_1_UC), + MOCS_ENTRY(63, 0, L3_1_UC), +}; + enum { HAS_GLOBAL_MOCS = BIT(0), HAS_ENGINE_MOCS = BIT(1), @@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915, { unsigned int flags; - if (IS_DG1(i915)) { + if (IS_XEHPSDV(i915)) { + table->size = ARRAY_SIZE(xehpsdv_mocs_table); + table->table = xehpsdv_mocs_table; + table->n_entries = GEN9_NUM_MOCS_ENTRIES; + } else if (IS_DG1(i915)) { table->size = ARRAY_SIZE(dg1_mocs_table); table->table = dg1_mocs_table; table->n_entries = GEN9_NUM_MOCS_ENTRIES; -- 2.25.4
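To make the new bit positions concrete, the sketch below composes two of the L3 control values used in xehpsdv_mocs_table. The shift macros follow the hunk above; the encodings of L3_1_UC and L3_3_WB (cacheability values 1 and 3) are not shown in this patch and are assumed here. The two accessor functions are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Bit positions as in the hunk above (the macro prefix is dropped for the sketch). */
#define L3_ESC(value)		((uint32_t)(value) << 0)
#define L3_SCC(value)		((uint32_t)(value) << 1)
#define L3_CACHEABILITY(value)	((uint32_t)(value) << 4)
#define L3_GLBGO(value)		((uint32_t)(value) << 6)
#define L3_LKUP(value)		((uint32_t)(value) << 7)

/* Assumed encodings; not part of the quoted diff. */
#define L3_1_UC	L3_CACHEABILITY(1)
#define L3_3_WB	L3_CACHEABILITY(3)

/* Entry 2 in the table: UC, coherent, Go point pushed to memory. */
static uint32_t l3cc_uc_coherent_go_memory(void)
{
	return L3_1_UC | L3_GLBGO(1) | L3_LKUP(1);
}

/* Entry 5 in the table: write-back with lookup enabled. */
static uint32_t l3cc_wb(void)
{
	return L3_3_WB | L3_LKUP(1);
}
```

Under these assumptions entry 2 evaluates to 0xD0 and entry 5 to 0xB0.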
[PATCH 23/53] drm/i915/xehpsdv: Read correct RP_STATE_CAP register
The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this register is now a per-tile register at GTTMMADDR offset 0x250014. Cc: Rodrigo Vivi Signed-off-by: Matt Roper Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++- drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 490bc1513480..8e7b70248392 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -1937,7 +1937,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps) struct drm_i915_private *i915 = rps_to_i915(rps); struct intel_uncore *uncore = rps_to_uncore(rps); - if (IS_GEN9_LP(i915)) + if (IS_XEHPSDV(i915)) + return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP); + else if (IS_GEN9_LP(i915)) return intel_uncore_read(uncore, BXT_RP_STATE_CAP); else return intel_uncore_read(uncore, GEN6_RP_STATE_CAP); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 0231f42226db..2992e8585399 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -4110,6 +4110,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN6_RP_STATE_CAP _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998) #define BXT_RP_STATE_CAP _MMIO(0x138170) #define GEN9_RP_STATE_LIMITS _MMIO(0x138148) +#define XEHPSDV_RP_STATE_CAP _MMIO(0x250014) /* * Logical Context regs -- 2.25.4
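Whichever register intel_rps_read_state_cap() ends up reading, the caller in gen6_rps_init() still decodes three 8-bit fields from the result, and the RP0 and RPn positions are swapped on GEN9_LP. The standalone sketch below reproduces that decoding from the intel_rps.c hunk earlier in the series. struct rps_caps, decode_rp_state_cap() and rps_units_to_mhz() are hypothetical names, and the 50 MHz unit is taken from the comment in that hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three decoded fields. */
struct rps_caps {
	uint8_t rp0_freq;
	uint8_t rp1_freq;
	uint8_t min_freq;	/* RPn */
};

/*
 * Decode RP_STATE_CAP. On the GEN9_LP layout RP0 sits in bits 23:16 and
 * RPn in bits 7:0; on the other layout the two are swapped. RP1 is in
 * bits 15:8 in both cases.
 */
static struct rps_caps decode_rp_state_cap(uint32_t rp_state_cap,
					   bool gen9_lp_layout)
{
	struct rps_caps caps;

	caps.rp1_freq = (rp_state_cap >> 8) & 0xff;
	if (gen9_lp_layout) {
		caps.rp0_freq = (rp_state_cap >> 16) & 0xff;
		caps.min_freq = (rp_state_cap >> 0) & 0xff;
	} else {
		caps.rp0_freq = (rp_state_cap >> 0) & 0xff;
		caps.min_freq = (rp_state_cap >> 16) & 0xff;
	}

	return caps;
}

/* Convert a field value to MHz, assuming the 50 MHz unit from the source comment. */
static unsigned int rps_units_to_mhz(uint8_t units)
{
	return units * 50u;
}
```

Since XeHP SDV is not a GEN9_LP platform, the value read from the new register would go through the non-GEN9_LP branch.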
[PATCH 38/53] drm/i915/dg2: Add dbuf programming
DG2 extends our DDB to four DBuf slices; pipes A+B only have access to the first two slices, whereas pipes C+D only have access to the second two. Confusingly, our bspec decided to switch from 1-based numbering of dbuf slices (S1, S2) to 0-based numbering (S0, S1, S2, S3) in Display13. At the moment we're using the 0-based number scheme for the DBUF_CTL_S() register addressing, but the 1-based number scheme in the actual slice assignment tables. We may want to consider switching the assignment over to 0-based numbering too at some point... Bspec: 49255 Bspec: 50057 Cc: Stanislav Lisovskiy Signed-off-by: Matt Roper --- .../drm/i915/display/intel_display_power.h| 4 + drivers/gpu/drm/i915/intel_pm.c | 120 +- 2 files changed, 123 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h index 22367b5cba96..ad788bbd727d 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.h +++ b/drivers/gpu/drm/i915/display/intel_display_power.h @@ -392,6 +392,10 @@ intel_display_power_put_all_in_set(struct drm_i915_private *i915, intel_display_power_put_mask_in_set(i915, power_domain_set, power_domain_set->mask); } +/* + * FIXME: We should probably switch this to a 0-based scheme to be consistent + * with how we now name/number DBUF_CTL instances. 
+ */ enum dbuf_slice { DBUF_S1, DBUF_S2, diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 5fdb96e7d266..ff8d89fff502 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4584,6 +4584,117 @@ static const struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {} }; +static const struct dbuf_slice_conf_entry dg2_allowed_dbufs[] = { + { + .active_pipes = BIT(PIPE_A), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + }, + }, + { + .active_pipes = BIT(PIPE_B), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + }, + }, + { + .active_pipes = BIT(PIPE_C), + .dbuf_mask = { + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_C), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_B) | BIT(PIPE_C), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_D), + .dbuf_mask = { + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_B) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_C) | 
BIT(PIPE_D), + .dbuf_mask = { + [PIPE_C] = BIT(DBUF_S3), + [PIPE_D] = BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3), + [PIPE_D] = BIT(DBUF_S4), + }, + }, + { + .active_pipes =
[PATCH 39/53] drm/i915/dg2: Don't program BW_BUDDY registers
Although the BW_BUDDY registers still exist, they are not used for anything on DG2. This change is expected to hold true for future dgpu's too. Bspec: 49218 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_display_power.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index c34ff0947b85..df6358638fee 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -5814,6 +5814,10 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv) unsigned long abox_mask = INTEL_INFO(dev_priv)->abox_mask; int config, i; + /* BW_BUDDY registers are not used on dgpu's beyond DG1 */ + if (IS_DGFX(dev_priv) && !IS_DG1(dev_priv)) + return; + if (IS_ALDERLAKE_S(dev_priv) || IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) || IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0)) -- 2.25.4
[PATCH 30/53] drm/i915/dg2: Maintain backward-compatible nested batch behavior
For tgl+, the per-context setting of MI_MODE[12] determines whether the bits of a nested MI_BATCH_BUFFER_START instruction should be interpreted in the traditional manner or whether they should instead use a new tgl+ meaning that breaks backward compatibility, but allows nesting into 3rd-level batchbuffers. For previous platforms, the hardware default for this register bit is to maintain backward-compatible behavior unless a context intentionally opts into the new behavior; however, Xe_HPG flips the hardware default behavior. From a SW perspective, we want to maintain the backward-compatible behavior for userspace, so we'll apply a fake workaround to set it back to the legacy behavior on platforms where the hardware default is to break compatibility. At the moment there is no Linux userspace that utilizes third-level batchbuffers, so this will spare userspace from needing to make any changes; using the legacy meaning is the correct thing to do. If/when we have userspace consumers that want to utilize third-level batch nesting, we can provide a context parameter to allow them to opt in.
Bspec: 45974, 45718 Cc: John Harrison Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++-- drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 38 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index f97ff2848122..43db766b0672 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -686,6 +686,37 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine, DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE); } +static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine, +struct i915_wa_list *wal) +{ + /* +* This is a "fake" workaround defined by software to ensure we +* maintain reliable, backward-compatible behavior for userspace with +* regards to how nested MI_BATCH_BUFFER_START commands are handled. +* +* The per-context setting of MI_MODE[12] determines whether the bits +* of a nested MI_BATCH_BUFFER_START instruction should be interpreted +* in the traditional manner or whether they should instead use a new +* tgl+ meaning that breaks backward compatibility, but allows nesting +* into 3rd-level batchbuffers. When this new capability was first +* added in TGL, it remained off by default unless a context +* intentionally opted in to the new behavior. However Xe_HPG now +* flips this on by default and requires that we explicitly opt out if +* we don't want the new behavior. +* +* From a SW perspective, we want to maintain the backward-compatible +* behavior for userspace, so we'll apply a fake workaround to set it +* back to the legacy behavior on platforms where the hardware default +* is to break compatibility. At the moment there is no Linux +* userspace that utilizes third-level batchbuffers, so this will avoid +* userspace from needing to make any changes. using the legacy +* meaning is the correct thing to do. 
If/when we have userspace +* consumers that want to utilize third-level batch nesting, we can +* provide a context parameter to allow them to opt-in. +*/ + wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN); +} + static void __intel_engine_init_ctx_wa(struct intel_engine_cs *engine, struct i915_wa_list *wal, @@ -693,11 +724,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine, { struct drm_i915_private *i915 = engine->i915; + wa_init_start(wal, name, engine->name); + + /* Applies to all engines */ + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) + fakewa_disable_nestedbb_mode(engine, wal); + if (engine->class != RENDER_CLASS) return; - wa_init_start(wal, name, engine->name); - if (IS_DG1(i915)) dg1_ctx_workarounds_init(engine, wal); else if (GRAPHICS_VER(i915) == 12) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index b19d102e0a01..35a42df1f2aa 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2821,6 +2821,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define MI_MODE _MMIO(0x209c) # define VS_TIMER_DISPATCH (1 << 6) # define MI_FLUSH_ENABLE (1 << 12) +# define TGL_NESTED_BB_EN (1 << 12) # define ASYNC_FLIP_PERF_DISABLE (1 << 14) # define MODE_IDLE (1 << 9) # define STOP_RING
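The workaround uses wa_masked_dis() because MI_MODE is written as a masked register. The sketch below models that convention in plain C under a stated assumption: the upper 16 bits of the written value select which of the lower 16 bits are updated, so a single bit can be cleared without a read-modify-write. The helper names are hypothetical; only the TGL_NESTED_BB_EN bit position comes from the patch.

```c
#include <stdint.h>

#define TGL_NESTED_BB_EN	(1u << 12)	/* bit position from the patch */

/* Value to write in order to set a bit: mask bit and value bit both set. */
static uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}

/* Value to write in order to clear a bit: mask bit set, value bit clear. */
static uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/*
 * Model of what the hardware would do with a masked write (assumption):
 * only the bits selected by the upper half of the write are changed.
 */
static uint32_t apply_masked_write(uint32_t reg, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t val = write & 0xffff;

	return (reg & ~mask) | (val & mask);
}
```

With this model, clearing TGL_NESTED_BB_EN leaves every other bit in the register, such as VS_TIMER_DISPATCH, untouched.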
[PATCH 26/53] drm/i915/dg2: Add forcewake table
The DG2 forcewake table is very similar to the one used by XeHP SDV (and both platforms are even presented as a single table in the bspec). For the most part DG2 starts using a few additional ranges that were 'reserved' on XeHP SDV and stops using some others. However there is a single range (0xd800-0xd87f) that needs to be handled differently between the two platforms (it needs GT wake on XeHP SDV, but render wake on DG2) so unless we want to wake both domains (which could waste power) or define new types of forcewake domains for this special case we need to have separate tables for the two platforms. Let's define the ranges for both platforms with a parameterized macro so that we don't actually need to duplicate everything in the code. It should be fine for DG2 to re-use the Xe_HP shadow register list so we can continue to use the 'xehpsdv' MMIO write functions and don't need to spin up a separate DG2 instance. Bspec: 66534 Cc: Daniele Ceraolo Spurio Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/intel_uncore.c | 305 +++- 1 file changed, 168 insertions(+), 137 deletions(-) diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 676b0052f01e..0c35acfcd6da 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1317,143 +1317,170 @@ static const struct intel_forcewake_range __gen12_fw_ranges[] = { 0x1d3f00 - 0x1d3fff: VD2 */ }; -/* *Must* be sorted by offset ranges! See intel_fw_table_check(). 
*/ -static const struct intel_forcewake_range __xehp_fw_ranges[] = { - GEN_FW_RANGE(0x0, 0x1fff, 0), /* - 0x0 - 0xaff: reserved - 0xb00 - 0x1fff: always on */ - GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT), - GEN_FW_RANGE(0x4b00, 0x51ff, 0), /* - 0x4b00 - 0x4fff: reserved - 0x5000 - 0x51ff: always on */ - GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT), - GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x8160, 0x81ff, 0), /* - 0x8160 - 0x817f: reserved - 0x8180 - 0x81ff: always on */ - GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT), - GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x8500, 0x94cf, FORCEWAKE_GT), /* - 0x8500 - 0x87ff: gt - 0x8800 - 0x8fff: reserved - 0x9000 - 0x947f: gt - 0x9480 - 0x94cf: reserved */ - GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x9560, 0x97ff, 0), /* - 0x9560 - 0x95ff: always on - 0x9600 - 0x97ff: reserved */ - GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /* - 0x9800 - 0xb4ff: gt - 0xb500 - 0xbfff: reserved - 0xc000 - 0xcfff: gt */ - GEN_FW_RANGE(0xd000, 0xd7ff, 0), - GEN_FW_RANGE(0xd800, 0xdbff, FORCEWAKE_GT), - GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /* - 0xdd00 - 0xddff: gt - 0xde00 - 0xde7f: reserved */ - GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /* - 0xde80 - 0xdfff: render - 0xe000 - 0xe0ff: reserved - 0xe100 - 0xe8ff: render */ - GEN_FW_RANGE(0xe900, 0x, FORCEWAKE_GT), /* - 0xe900 - 0xe9ff: gt - 0xea00 - 0xefff: reserved - 0xf000 - 0x: gt */ - GEN_FW_RANGE(0x1, 0x13fff, 0), /* - 0x1 - 0x11fff: reserved - 0x12000 - 0x127ff: always on - 0x12800 - 0x13fff: reserved */ - GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0), - GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2), - GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4), - GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6), - GEN_FW_RANGE(0x14800, 0x1, 
FORCEWAKE_RENDER), /* - 0x14800 - 0x14fff: render - 0x15000 - 0x16dff: reserved - 0x16e00 - 0x1: render */ - GEN_FW_RANGE(0x2, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /* - 0x2 - 0x20fff: VD0 - 0x21000 - 0x21fff: reserved */ - GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT), - GEN_FW_RANGE(0x24000, 0x2417f, 0), /* - 0x24000 - 0x2407f: always on - 0x24080 - 0x2417f: reserved */ - GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /* - 0x24180 - 0x241ff: gt - 0x24200 - 0x249ff: reserved */ - GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /* - 0x24a00 - 0x24a7f: render - 0x24a80 - 0x251ff: reserved */ - GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /* - 0x25200 - 0x252ff: gt - 0x25300 - 0x25
[PATCH 41/53] drm/i915/dg2: DG2 has fixed memory bandwidth
DG2 doesn't have SAGV or QGV points that determine memory bandwidth. Instead it has a constant amount of memory bandwidth available to display that does not need to be reduced based on the number of active planes. For simplicity, we'll just modify driver initialization to create a single dummy QGV point with the proper amount of memory bandwidth, rather than trying to query the pcode for this information. Bspec: 64631 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_bw.c | 24 +++- 1 file changed, 23 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c index bfb398f0432e..4ca83874d0aa 100644 --- a/drivers/gpu/drm/i915/display/intel_bw.c +++ b/drivers/gpu/drm/i915/display/intel_bw.c @@ -234,6 +234,26 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel return 0; } +static void dg2_get_bw_info(struct drm_i915_private *i915) +{ + struct intel_bw_info *bi = &i915->max_bw[0]; + + /* +* DG2 doesn't have SAGV or QGV points, just a constant max bandwidth +* that doesn't depend on the number of planes enabled. Create a +* single dummy QGV point to reflect that. DG2-G10 platforms have a +* constant 50 GB/s bandwidth, whereas DG2-G11 platforms have 38 GB/s. +*/ + bi->num_planes = 1; + bi->num_qgv_points = 1; + if (IS_DG2_G11(i915)) + bi->deratedbw[0] = 38000; + else + bi->deratedbw[0] = 50000; + + i915->sagv_status = I915_SAGV_NOT_CONTROLLED; +} + static unsigned int icl_max_bw(struct drm_i915_private *dev_priv, int num_planes, int qgv_point) { @@ -267,7 +287,9 @@ void intel_bw_init_hw(struct drm_i915_private *dev_priv) if (!HAS_DISPLAY(dev_priv)) return; - if (IS_DG2(dev_priv)) + dg2_get_bw_info(dev_priv); + else if (IS_ALDERLAKE_S(dev_priv) || IS_ALDERLAKE_P(dev_priv)) icl_get_bw_info(dev_priv, &adls_sa_info); else if (IS_ROCKETLAKE(dev_priv)) icl_get_bw_info(dev_priv, &rkl_sa_info); -- 2.25.4
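The single dummy QGV point described in the commit message can be pictured with the small sketch below. struct bw_info_sketch and fill_fixed_bw() are hypothetical simplifications of the driver's structures, and the MB/s unit is an assumption inferred from the 38000 value the patch stores for the 38 GB/s case.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver's bandwidth info. */
struct bw_info_sketch {
	unsigned int num_qgv_points;
	unsigned int num_planes;
	unsigned int deratedbw[1];	/* assumed MB/s, one entry per QGV point */
};

/*
 * Fill in one dummy QGV point with a constant bandwidth: 38 GB/s for the
 * G11 variant and 50 GB/s otherwise, per the comment in the patch.
 */
static void fill_fixed_bw(struct bw_info_sketch *bi, bool is_g11)
{
	unsigned int gbps = is_g11 ? 38 : 50;

	bi->num_planes = 1;
	bi->num_qgv_points = 1;
	bi->deratedbw[0] = gbps * 1000;
}
```

Because there is only one point, any later lookup over QGV points sees the same bandwidth regardless of how many planes are enabled.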
[PATCH 06/53] drm/i915/selftests: Allow for larger engine counts
From: John Harrison Increasing the engine count causes a couple of local array variables to exceed the kernel stack limit. So make them dynamic allocations instead. Signed-off-by: John Harrison Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/selftest_execlists.c | 10 -- .../gpu/drm/i915/gt/selftest_workarounds.c| 32 --- 2 files changed, 29 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index 08896ae027d5..1e7fe479 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) #define BATCH BIT(0) { struct task_struct *tsk[I915_NUM_ENGINES] = {}; - struct preempt_smoke arg[I915_NUM_ENGINES]; + struct preempt_smoke *arg; struct intel_engine_cs *engine; enum intel_engine_id id; unsigned long count; int err = 0; + arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL); + if (!arg) + return -ENOMEM; + for_each_engine(engine, smoke->gt, id) { arg[id] = *smoke; arg[id].engine = engine; @@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) arg[id].batch = NULL; arg[id].count = 0; - tsk[id] = kthread_run(smoke_crescendo_thread, &arg, + tsk[id] = kthread_run(smoke_crescendo_thread, arg, "igt/smoke:%d", id); if (IS_ERR(tsk[id])) { err = PTR_ERR(tsk[id]); @@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n", count, flags, smoke->gt->info.num_engines, smoke->ncontext); + + kfree(arg); return 0; } diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 7ebc4edb8ecf..7a38ce40feb2 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ 
b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -1175,31 +1175,36 @@ live_gpu_reset_workarounds(void *arg) { struct intel_gt *gt = arg; intel_wakeref_t wakeref; - struct wa_lists lists; + struct wa_lists *lists; bool ok; if (!intel_has_gpu_reset(gt)) return 0; + lists = kzalloc(sizeof(*lists), GFP_KERNEL); + if (!lists) + return -ENOMEM; + pr_info("Verifying after GPU reset...\n"); igt_global_reset_lock(gt); wakeref = intel_runtime_pm_get(gt->uncore->rpm); - reference_lists_init(gt, &lists); + reference_lists_init(gt, lists); - ok = verify_wa_lists(gt, &lists, "before reset"); + ok = verify_wa_lists(gt, lists, "before reset"); if (!ok) goto out; intel_gt_reset(gt, ALL_ENGINES, "live_workarounds"); - ok = verify_wa_lists(gt, &lists, "after reset"); + ok = verify_wa_lists(gt, lists, "after reset"); out: - reference_lists_fini(gt, &lists); + reference_lists_fini(gt, lists); intel_runtime_pm_put(gt->uncore->rpm, wakeref); igt_global_reset_unlock(gt); + kfree(lists); return ok ? 0 : -ESRCH; } @@ -1214,16 +1219,20 @@ live_engine_reset_workarounds(void *arg) struct igt_spinner spin; struct i915_request *rq; intel_wakeref_t wakeref; - struct wa_lists lists; + struct wa_lists *lists; int ret = 0; if (!intel_has_reset_engine(gt)) return 0; + lists = kzalloc(sizeof(*lists), GFP_KERNEL); + if (!lists) + return -ENOMEM; + igt_global_reset_lock(gt); wakeref = intel_runtime_pm_get(gt->uncore->rpm); - reference_lists_init(gt, &lists); + reference_lists_init(gt, lists); for_each_engine(engine, gt, id) { bool ok; @@ -1235,7 +1244,7 @@ live_engine_reset_workarounds(void *arg) break; } - ok = verify_wa_lists(gt, &lists, "before reset"); + ok = verify_wa_lists(gt, lists, "before reset"); if (!ok) { ret = -ESRCH; goto err; @@ -1247,7 +1256,7 @@ live_engine_reset_workarounds(void *arg) goto err; } - ok = verify_wa_lists(gt, &lists, "after idle reset"); + ok = verify_wa_lists(gt, lists, "after idle reset"); if (!ok) { ret = -ESRCH; goto err; @@ -1282,7 +1291,7 @@ 
live_engine_reset_workarounds(
[PATCH 37/53] drm/i915/dg2: Setup display outputs
DG2 has outputs on DDI A-D attached to what the bspec diagram shows as "Combo PHY A-D." Note that despite being labelled "combo" the PHYs on these outputs are Synopsys PHYs rather than traditional Intel combo PHY technology. Cc: Anusha Srivatsa Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_display.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index c673d0c8fb4a..dc2b943a4e72 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -11329,7 +11329,12 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv) if (!HAS_DISPLAY(dev_priv)) return; - if (IS_ALDERLAKE_P(dev_priv)) { + if (IS_DG2(dev_priv)) { + intel_ddi_init(dev_priv, PORT_A); + intel_ddi_init(dev_priv, PORT_B); + intel_ddi_init(dev_priv, PORT_C); + intel_ddi_init(dev_priv, PORT_D_XELPD); + } else if (IS_ALDERLAKE_P(dev_priv)) { intel_ddi_init(dev_priv, PORT_A); intel_ddi_init(dev_priv, PORT_B); intel_ddi_init(dev_priv, PORT_TC1); -- 2.25.4
[PATCH 31/53] drm/i915/dg2: Report INSTDONE_GEOM values in error state
Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team has indicated that having these reported in the error state would be useful for debugging GPU hangs. These registers are replicated per-DSS with gslice steering. Cc: Lionel Landwerlin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 7 +++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 3 +++ drivers/gpu/drm/i915/i915_gpu_error.c| 10 -- drivers/gpu/drm/i915/i915_reg.h | 1 + 4 files changed, 19 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index e1302e9c168b..b3c002e4ae9f 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1220,6 +1220,13 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, GEN7_ROW_INSTDONE); } } + + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { + for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) + instdone->geom_svg[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + XEHPG_INSTDONE_GEOM_SVG); + } } else if (GRAPHICS_VER(i915) >= 7) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base)); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index e917b7519f2b..93609d797ac2 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -80,6 +80,9 @@ struct intel_instdone { u32 slice_common_extra[2]; u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; + + /* Added in XeHPG */ + u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; }; /* diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index c1e744b5ab47..4de7edc451ef 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -431,6 +431,7 @@ static void 
error_print_instdone(struct drm_i915_error_state_buf *m, const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu; int slice; int subslice; + int iter; err_printf(m, " INSTDONE: 0x%08x\n", ee->instdone.instdone); @@ -445,8 +446,6 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, return; if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) { - int iter; - for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", slice, subslice, @@ -471,6 +470,13 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, if (GRAPHICS_VER(m->i915) < 12) return; + if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) { + for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) + err_printf(m, " GEOM_SVGUNIT_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.geom_svg[slice][subslice]); + } + err_printf(m, " SC_INSTDONE_EXTRA: 0x%08x\n", ee->instdone.slice_common_extra[0]); err_printf(m, " SC_INSTDONE_EXTRA2: 0x%08x\n", diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 35a42df1f2aa..d58864c7adc6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2686,6 +2686,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN12_SC_INSTDONE_EXTRA2 _MMIO(0x7108) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define XEHPG_INSTDONE_GEOM_SVG _MMIO(0x666c) #define MCFG_MCR_SELECTOR _MMIO(0xfd0) #define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc) -- 2.25.4
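For readers who have not seen the loop macro before: the commit message for 15/53 describes for_each_instdone_gslice_dss_xehp() as walking the flat DSS space while still handing back gslice/dss steering values. A rough userspace sketch of that idea follows. It is not the kernel macro. DSS_PER_GSLICE = 4 is an assumption (32 DSS over the 8 gslices mentioned elsewhere in this series), fake_read_reg() is a made-up stand-in for read_subslice_reg(), and the output array is sized by DSS per gslice rather than by I915_MAX_SUBSLICES as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DSS        32  /* GEN_MAX_SUBSLICES after the Xe_HP SDV change */
#define DSS_PER_GSLICE  4  /* assumption: 32 DSS spread over 8 gslices */
#define MAX_GSLICES    (MAX_DSS / DSS_PER_GSLICE)

/*
 * Hypothetical stand-in for the per-DSS register read; it only encodes
 * the steering pair so the result is easy to inspect.
 */
static uint32_t fake_read_reg(int gslice, int dss)
{
	return 0x1000u + (uint32_t)gslice * 0x10u + (uint32_t)dss;
}

/*
 * Walk the flat DSS mask and store one value per enabled DSS, indexed by
 * the (gslice, dss) pair derived from the flat index. Returns the number
 * of values captured.
 */
static int capture_geom_svg(uint32_t dss_mask,
			    uint32_t out[MAX_GSLICES][DSS_PER_GSLICE])
{
	int captured = 0;

	assert(out != NULL);

	for (int iter = 0; iter < MAX_DSS; iter++) {
		int gslice = iter / DSS_PER_GSLICE;
		int dss = iter % DSS_PER_GSLICE;

		if (!(dss_mask & (UINT32_C(1) << iter)))
			continue; /* fused-off DSS: leave the slot untouched */

		out[gslice][dss] = fake_read_reg(gslice, dss);
		captured++;
	}

	return captured;
}
```

With a mask of 0x21 (flat DSS 0 and 5 enabled) this fills out[0][0] and out[1][1] and leaves every other slot alone, which is the behaviour one would want before printing per-DSS values in the error state.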
[PATCH 14/53] drm/i915/xehp: handle new steering options
From: Daniele Ceraolo Spurio Xe_HP is more modular than its predecessors and as a consequence it has more types of replicated registers. As with l3bank regions on previous platforms, we may need to explicitly re-steer accesses to these new types of ranges at runtime if we can't find a single default steering value that satisfies the fusing of all types. Bspec: 66534 Cc: Tvrtko Ursulin Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt.c | 40 - drivers/gpu/drm/i915/gt/intel_gt.h | 1 + drivers/gpu/drm/i915/gt/intel_gt_types.h| 7 ++ drivers/gpu/drm/i915/gt/intel_region_lmem.c | 1 + drivers/gpu/drm/i915/gt/intel_sseu.c| 18 + drivers/gpu/drm/i915/gt/intel_sseu.h| 6 ++ drivers/gpu/drm/i915/gt/intel_workarounds.c | 89 +++-- drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/i915_reg.h | 4 + drivers/gpu/drm/i915/intel_device_info.h| 1 + 11 files changed, 165 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index e714e21c0a4d..f59bcedbb80b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -89,6 +89,13 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = { {}, }; +static u16 slicemask(struct intel_gt *gt, int count) +{ + u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0); + + return intel_slicemask_from_dssmask(dss_mask, count); +} + int intel_gt_init_mmio(struct intel_gt *gt) { intel_gt_init_clock_frequency(gt); @@ -96,11 +103,24 @@ int intel_gt_init_mmio(struct intel_gt *gt) intel_uc_init_mmio(&gt->uc); intel_sseu_info_init(gt); - if (GRAPHICS_VER(gt->i915) >= 11) { + /* +* An mslice is unavailable only if both the meml3 for the slice is +* disabled *and* all of the DSS in the slice (quadrant) are disabled.
+*/ + if (HAS_MSLICES(gt->i915)) + gt->info.mslice_mask = + slicemask(gt, GEN_DSS_PER_MSLICE) | + (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & +GEN12_MEML3_EN_MASK); + + if (GRAPHICS_VER(gt->i915) >= 11 && + GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50)) { gt->steering_table[L3BANK] = icl_l3bank_steering_table; gt->info.l3bank_mask = ~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN10_L3BANK_MASK; + } else if (HAS_MSLICES(gt->i915)) { + MISSING_CASE(INTEL_INFO(gt->i915)->platform); } return intel_engines_init_mmio(gt); @@ -766,6 +786,24 @@ static void intel_gt_get_valid_steering(struct intel_gt *gt, *sliceid = 0; /* unused */ *subsliceid = __ffs(gt->info.l3bank_mask); break; + case MSLICE: + GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */ + + *sliceid = __ffs(gt->info.mslice_mask); + *subsliceid = 0;/* unused */ + break; + case LNCF: + GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */ + + /* +* 0xFDC[29:28] selects the mslice to steer to and 0xFDC[27] +* selects which LNCF within the mslice to steer to. An LNCF +* is always present if its mslice is present, so we can safely +* just steer to LNCF 0 in all cases. 
+*/ + *sliceid = __ffs(gt->info.mslice_mask) << 1; + *subsliceid = 0;/* unused */ + break; default: MISSING_CASE(type); *sliceid = 0; diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h index e7aabe0cc5bf..f9bcde31f697 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.h +++ b/drivers/gpu/drm/i915/gt/intel_gt.h @@ -82,6 +82,7 @@ static inline bool intel_gt_needs_read_steering(struct intel_gt *gt, } u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg); +u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg); void intel_gt_info_print(const struct intel_gt_info *info, struct drm_printer *p); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index d93d578a4105..b06d8eaf12ea 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -47,9 +47,14 @@ struct intel_mmio_range { * of multicast registers. If another type of steering does not have any * overlap in valid steering targets with 'subslice' style registers, we will * need to explicitly re-steer reads of registers of the othe
[PATCH 33/53] drm/i915/dg2: Add fake PCH
As with DG1, DG2 has an ICL-style south display interface provided on the same PCI device. Add a fake PCH to ensure DG2 takes the appropriate codepaths for south display handling. Bspec: 54871, 50062, 49961, 53673 Cc: Lucas De Marchi Signed-off-by: Matt Roper Signed-off-by: Aditya Swarup Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/i915_irq.c | 2 +- drivers/gpu/drm/i915/intel_pch.c | 3 +++ drivers/gpu/drm/i915/intel_pch.h | 2 ++ 3 files changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 9d47ffa39093..34a0d49e760e 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -208,7 +208,7 @@ static void intel_hpd_init_pins(struct drm_i915_private *dev_priv) (!HAS_PCH_SPLIT(dev_priv) || HAS_PCH_NOP(dev_priv))) return; - if (HAS_PCH_DG1(dev_priv)) + if (INTEL_PCH_TYPE(dev_priv) >= PCH_DG1) hpd->pch_hpd = hpd_sde_dg1; else if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP) hpd->pch_hpd = hpd_icp; diff --git a/drivers/gpu/drm/i915/intel_pch.c b/drivers/gpu/drm/i915/intel_pch.c index 4e92ae19189e..cc44164e242b 100644 --- a/drivers/gpu/drm/i915/intel_pch.c +++ b/drivers/gpu/drm/i915/intel_pch.c @@ -211,6 +211,9 @@ void intel_detect_pch(struct drm_i915_private *dev_priv) if (IS_DG1(dev_priv)) { dev_priv->pch_type = PCH_DG1; return; + } else if (IS_DG2(dev_priv)) { + dev_priv->pch_type = PCH_DG2; + return; } /* diff --git a/drivers/gpu/drm/i915/intel_pch.h b/drivers/gpu/drm/i915/intel_pch.h index e2f3f30c6445..7c0d83d292dc 100644 --- a/drivers/gpu/drm/i915/intel_pch.h +++ b/drivers/gpu/drm/i915/intel_pch.h @@ -30,6 +30,7 @@ enum intel_pch { /* Fake PCHs, functionality handled on the same PCI dev */ PCH_DG1 = 1024, + PCH_DG2, }; #define INTEL_PCH_DEVICE_ID_MASK 0xff80 @@ -62,6 +63,7 @@ enum intel_pch { #define INTEL_PCH_TYPE(dev_priv) ((dev_priv)->pch_type) #define INTEL_PCH_ID(dev_priv) ((dev_priv)->pch_id) +#define HAS_PCH_DG2(dev_priv) (INTEL_PCH_TYPE(dev_priv) == 
PCH_DG2) #define HAS_PCH_ADP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_ADP) #define HAS_PCH_DG1(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_DG1) #define HAS_PCH_JSP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_JSP) -- 2.25.4
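A note on why the i915_irq.c hunk can switch from HAS_PCH_DG1() to a ">= PCH_DG1" comparison: PCH_DG2 is added right after PCH_DG1 = 1024 with no initializer, so it takes the value 1025 and compares greater than every real PCH. The standalone sketch below illustrates that ordering only. The enum is cut down (the real one has more entries), and enum hpd_table and pick_hpd_table() are hypothetical names standing in for the hpd pin tables and the if/else chain in intel_hpd_init_pins().

```c
#include <assert.h>

/* Cut-down copy of enum intel_pch; the real enum has more entries. */
enum intel_pch {
	PCH_NONE = 0,
	PCH_ICP,
	PCH_ADP,
	/* Fake PCHs, functionality handled on the same PCI dev */
	PCH_DG1 = 1024,
	PCH_DG2,	/* no initializer, so PCH_DG1 + 1 */
};

/* The range check below only works while the fake PCHs stay contiguous. */
static_assert(PCH_DG2 == PCH_DG1 + 1, "fake PCHs must follow PCH_DG1");

/* Hypothetical labels standing in for the hpd pin tables. */
enum hpd_table { HPD_OTHER, HPD_ICP, HPD_SDE_DG1 };

/* Mirrors the first two branches of the chain in intel_hpd_init_pins(). */
static enum hpd_table pick_hpd_table(enum intel_pch type)
{
	if (type >= PCH_DG1)
		return HPD_SDE_DG1;	/* DG1 and DG2 both land here */
	else if (type >= PCH_ICP)
		return HPD_ICP;

	return HPD_OTHER;
}
```

The consequence is that any fake PCH appended after PCH_DG2 would also pick the DG1 table unless a more specific check is added ahead of this one.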
[PATCH 18/53] drm/i915/xehpsdv: Add maximum sseu limits
Due to the removal of legacy slices and the transition to a gslice/cslice/mslice/etc. design, we'll internally store all DSS under "slice0." Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 5 - drivers/gpu/drm/i915/gt/intel_sseu.h | 2 +- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 2 +- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 5d1b7d06c96b..16c0552fcd1d 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -145,7 +145,10 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * across the entire device. Then calculate out the DSS for each * workload type within that software slice. */ - intel_sseu_set_info(sseu, 1, 6, 16); + if (IS_XEHPSDV(gt->i915)) + intel_sseu_set_info(sseu, 1, 32, 16); + else + intel_sseu_set_info(sseu, 1, 6, 16); /* * As mentioned above, Xe_HP does not have the concept of a slice. diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 74487650b08f..204ea6709460 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -16,7 +16,7 @@ struct intel_gt; struct drm_printer; #define GEN_MAX_SLICES (6) /* CNL upper bound */ -#define GEN_MAX_SUBSLICES (8) /* ICL upper bound */ +#define GEN_MAX_SUBSLICES (32) /* XEHPSDV upper bound */ #define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE) #define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES) #define GEN_MAX_EUS (16) /* TGL upper bound */ diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c index 714fe8495775..a424150b052e 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c @@ -53,7 +53,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt, static void gen10_sseu_device_status(struct intel_gt *gt, struct
sseu_dev_info *sseu) { -#define SS_MAX 6 +#define SS_MAX 8 struct intel_uncore *uncore = gt->uncore; const struct intel_gt_info *info = &gt->info; u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2]; -- 2.25.4
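One knock-on effect of raising GEN_MAX_SUBSLICES is that GEN_MAX_SUBSLICE_STRIDE, which is derived from it through GEN_SSEU_STRIDE(), grows from 1 byte to 4 bytes per slice. The sketch below just redoes that arithmetic in userspace. The two macros are re-declared locally with the same shape as the kernel ones, BITS_PER_BYTE is taken as 8, and subslice_stride() is a hypothetical wrapper added for the example.

```c
#include <assert.h>

#define BITS_PER_BYTE 8
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)

/* Bytes needed for one slice's subslice mask, given the subslice limit. */
static unsigned int subslice_stride(unsigned int max_subslices)
{
	assert(max_subslices > 0);

	return GEN_SSEU_STRIDE(max_subslices);
}
```

So subslice_stride(8) is 1 and subslice_stride(32) is 4, meaning any code that assumed a whole subslice mask fits in a single byte needs a second look once this patch is applied.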
[PATCH 03/53] drm/i915: Fork DG1 interrupt handler
From: Paulo Zanoni The current interrupt handler is getting increasingly complicated and Xe_HP changes will bring even more complexity. Let's split off a new interrupt handler starting with DG1 (i.e., when the master tile interrupt register was added to the design) and use that as the basis for the new Xe_HP changes. Now that we track the hardware IP's release number as well as the version number, we can also properly define DG1 as version "12.10" and replace the has_master_unit_irq feature flag with an IP version test. Bspec: 50875 Cc: Daniele Spurio Ceraolo Cc: Stuart Summers Signed-off-by: Paulo Zanoni Signed-off-by: Lucas De Marchi Signed-off-by: Tomasz Lis Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_drv.h | 2 - drivers/gpu/drm/i915/i915_irq.c | 139 +++ drivers/gpu/drm/i915/i915_pci.c | 2 +- drivers/gpu/drm/i915/i915_reg.h | 4 +- drivers/gpu/drm/i915/intel_device_info.h | 1 - 5 files changed, 95 insertions(+), 53 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9639800485b9..519cce702f4b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1601,8 +1601,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_LOGICAL_RING_ELSQ(dev_priv) \ (INTEL_INFO(dev_priv)->has_logical_ring_elsq) -#define HAS_MASTER_UNIT_IRQ(dev_priv) (INTEL_INFO(dev_priv)->has_master_unit_irq) - #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv) #define INTEL_PPGTT(dev_priv) (INTEL_INFO(dev_priv)->ppgtt_type) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 7d0ce8b9f8ed..9d47ffa39093 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2699,11 +2699,9 @@ gen11_display_irq_handler(struct drm_i915_private *i915) enable_rpm_wakeref_asserts(&i915->runtime_pm); } -static __always_inline irqreturn_t -__gen11_irq_handler(struct drm_i915_private * const i915, - u32 (*intr_disable)(void __iomem * const
regs), - void (*intr_enable)(void __iomem * const regs)) +static irqreturn_t gen11_irq_handler(int irq, void *arg) { + struct drm_i915_private *i915 = arg; void __iomem * const regs = i915->uncore.regs; struct intel_gt *gt = &i915->gt; u32 master_ctl; @@ -2712,9 +2710,9 @@ __gen11_irq_handler(struct drm_i915_private * const i915, if (!intel_irqs_enabled(i915)) return IRQ_NONE; - master_ctl = intr_disable(regs); + master_ctl = gen11_master_intr_disable(regs); if (!master_ctl) { - intr_enable(regs); + gen11_master_intr_enable(regs); return IRQ_NONE; } @@ -2727,7 +2725,7 @@ __gen11_irq_handler(struct drm_i915_private * const i915, gu_misc_iir = gen11_gu_misc_irq_ack(gt, master_ctl); - intr_enable(regs); + gen11_master_intr_enable(regs); gen11_gu_misc_irq_handler(gt, gu_misc_iir); @@ -2736,51 +2734,69 @@ __gen11_irq_handler(struct drm_i915_private * const i915, return IRQ_HANDLED; } -static irqreturn_t gen11_irq_handler(int irq, void *arg) -{ - return __gen11_irq_handler(arg, - gen11_master_intr_disable, - gen11_master_intr_enable); -} - -static u32 dg1_master_intr_disable_and_ack(void __iomem * const regs) +static inline u32 dg1_master_intr_disable(void __iomem * const regs) { u32 val; /* First disable interrupts */ - raw_reg_write(regs, DG1_MSTR_UNIT_INTR, 0); + raw_reg_write(regs, DG1_MSTR_TILE_INTR, 0); /* Get the indication levels and ack the master unit */ - val = raw_reg_read(regs, DG1_MSTR_UNIT_INTR); + val = raw_reg_read(regs, DG1_MSTR_TILE_INTR); if (unlikely(!val)) return 0; - raw_reg_write(regs, DG1_MSTR_UNIT_INTR, val); - - /* -* Now with master disabled, get a sample of level indications -* for this interrupt and ack them right away - we keep GEN11_MASTER_IRQ -* out as this bit doesn't exist anymore for DG1 -*/ - val = raw_reg_read(regs, GEN11_GFX_MSTR_IRQ) & ~GEN11_MASTER_IRQ; - if (unlikely(!val)) - return 0; - - raw_reg_write(regs, GEN11_GFX_MSTR_IRQ, val); + raw_reg_write(regs, DG1_MSTR_TILE_INTR, val); return val; } static inline void 
dg1_master_intr_enable(void __iomem * const regs) { - raw_reg_write(regs, DG1_MSTR_UNIT_INTR, DG1_MSTR_IRQ); + raw_reg_write(regs, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ); } static irqreturn_t dg1_irq_handler(int irq, void *arg) { - return __gen11_irq_handler(arg, - dg1_master_intr_disable_and_ack, - dg1_master_intr_enable); + struct drm_i915_private
[PATCH 15/53] drm/i915/xehp: Loop over all gslices for INSTDONE processing
We no longer have traditional slices on Xe_HP platforms, but the INSTDONE registers are replicated according to gslice representation which is similar. We can mostly re-use the existing instdone code with just a few modifications: * Create an alternate instdone loop macro that will iterate over the flat DSS space, but still provide the gslice/dss steering values for compatibility with the legacy code. * We should allocate INSTDONE storage space according to the maximum number of gslices rather than the maximum number of legacy slices to ensure we have enough storage space to hold all of the values. XeHP design has 8 gslices, whereas older platforms never had more than 3 slices. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 48 +++- drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 - drivers/gpu/drm/i915/gt/intel_sseu.h | 7 +++ drivers/gpu/drm/i915/i915_gpu_error.c| 32 + 4 files changed, 66 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 6e2aa1acc4d4..e1302e9c168b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1181,16 +1181,16 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, u32 mmio_base = engine->mmio_base; int slice; int subslice; + int iter; memset(instdone, 0, sizeof(*instdone)); - switch (GRAPHICS_VER(i915)) { - default: + if (GRAPHICS_VER(i915) >= 8) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base)); if (engine->id != RCS0) - break; + return; instdone->slice_common = intel_uncore_read(uncore, GEN7_SC_INSTDONE); @@ -1200,21 +1200,32 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, instdone->slice_common_extra[1] = intel_uncore_read(uncore, GEN12_SC_INSTDONE_EXTRA2); } - for_each_instdone_slice_subslice(i915, sseu, slice, subslice) { - instdone->sampler[slice][subslice] = - read_subslice_reg(engine, slice, subslice, - 
GEN7_SAMPLER_INSTDONE); - instdone->row[slice][subslice] = - read_subslice_reg(engine, slice, subslice, - GEN7_ROW_INSTDONE); + + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) { + instdone->sampler[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_SAMPLER_INSTDONE); + instdone->row[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_ROW_INSTDONE); + } + } else { + for_each_instdone_slice_subslice(i915, sseu, slice, subslice) { + instdone->sampler[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_SAMPLER_INSTDONE); + instdone->row[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_ROW_INSTDONE); + } } - break; - case 7: + } else if (GRAPHICS_VER(i915) >= 7) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base)); if (engine->id != RCS0) - break; + return; instdone->slice_common = intel_uncore_read(uncore, GEN7_SC_INSTDONE); @@ -1222,22 +1233,15 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE); instdone->row[0][0] = intel_uncore_read(uncore, GEN7_ROW_INSTDONE); - - break; - case 6: - case 5: - case 4: + } else if (GRAPHICS_VER(i915) >= 4) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base)); if (engine->id == RCS0) /* HACK: Using the wrong struct mem
[PATCH 17/53] drm/i915/xehp: Changes to ss/eu definitions
From: Matthew Auld Xe_HP no longer has "slices" in the same way that old platforms did. There are new concepts (gslices, cslices, mslices) that apply in various contexts, but for the purposes of fusing slices no longer exist and we just have one large pool of dual-subslices (DSS) to work with. Furthermore, the meaning of the DSS fuse is inverted compared to past platforms --- it now specifies which DSS are enabled rather than which ones are disabled. Cc: Henryk Napiatek Cc: Rodrigo Vivi Cc: Lucas De Marchi Cc: Tvrtko Ursulin Signed-off-by: Matthew Auld Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Radhakrishna Sripada Signed-off-by: Stuart Summers Signed-off-by: Prasad Nallani Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 24 drivers/gpu/drm/i915/i915_getparam.c | 6 -- drivers/gpu/drm/i915/i915_reg.h | 3 +++ 3 files changed, 27 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index bbed8e8625e1..5d1b7d06c96b 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -139,17 +139,33 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS. * Instead of splitting these, provide userspace with an array * of DSS to more closely represent the hardware resource. +* +* In addition, the concept of slice has been removed in Xe_HP. +* To be compatible with prior generations, assume a single slice +* across the entire device. Then calculate out the DSS for each +* workload type within that software slice. */ intel_sseu_set_info(sseu, 1, 6, 16); - s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & - GEN11_GT_S_ENA_MASK; + /* +* As mentioned above, Xe_HP does not have the concept of a slice. +* Enable one for software backwards compatibility. 
+*/ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + s_en = 0x1; + else + s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & + GEN11_GT_S_ENA_MASK; dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE); /* one bit per pair of EUs */ - eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & - GEN11_EU_DIS_MASK); + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; + else + eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & + GEN11_EU_DIS_MASK); + for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index 24e18219eb50..e289397d9178 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -15,7 +15,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, struct pci_dev *pdev = to_pci_dev(dev->dev); const struct sseu_dev_info *sseu = &i915->gt.info.sseu; drm_i915_getparam_t *param = data; - int value; + int value = 0; switch (param->param) { case I915_PARAM_IRQ_ACTIVE: @@ -150,7 +150,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, return -ENODEV; break; case I915_PARAM_SUBSLICE_MASK: - value = sseu->subslice_mask[0]; + /* Only copy bits from the first slice */ + memcpy(&value, sseu->subslice_mask, + min(sseu->ss_stride, (u8)sizeof(value))); if (!value) return -ENODEV; break; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 43fdf63a2240..9edb58c796e8 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3151,6 +3151,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN12_GT_DSS_ENABLE _MMIO(0x913C) +#define XEHP_EU_ENABLE _MMIO(0x9134) +#define XEHP_EU_ENA_MASK 0xFF + #define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050) #define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0)
#define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2) -- 2.25.4
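To make the EU fuse handling in the intel_sseu.c hunk easier to follow: the register now reports enabled EU pairs instead of disabled ones, and each fuse bit still covers two EUs. Below is a small userspace sketch of that expansion, not the driver code. eu_enable_mask() and its fuse_is_enable_mask parameter are hypothetical, the register read is replaced by a plain argument, and the 0xFF mask for the legacy path is assumed to match XEHP_EU_ENA_MASK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))
#define EU_FUSE_MASK 0xFFu /* XEHP_EU_ENA_MASK; assumed equal for the legacy mask */

/*
 * Turn a raw fuse value into a per-EU enable mask.
 * fuse_is_enable_mask: true for the Xe_HP style (set bit = enabled pair),
 * false for the older style (set bit = disabled pair).
 */
static uint32_t eu_enable_mask(uint32_t raw_fuse, bool fuse_is_enable_mask,
			       unsigned int max_eus_per_subslice)
{
	uint32_t fuse, eu_en = 0;

	assert(max_eus_per_subslice <= 32 && max_eus_per_subslice % 2 == 0);

	if (fuse_is_enable_mask)
		fuse = raw_fuse & EU_FUSE_MASK;
	else
		fuse = ~(raw_fuse & EU_FUSE_MASK);

	/* One fuse bit covers a pair of EUs. */
	for (unsigned int eu = 0; eu < max_eus_per_subslice / 2; eu++)
		if (fuse & BIT(eu))
			eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);

	return eu_en;
}
```

The same raw value 0x03 therefore means "only EUs 0-3 present" in the Xe_HP style and "EUs 0-3 missing" in the older style, which is why the two paths cannot share a single read-and-invert step.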
[PATCH 42/53] drm/i915/dg2: Add MPLLB programming for SNPS PHY
DG2's SNPS PHYs incorporate a dedicated port PLL called MPLLB which takes the place of the shared DPLLs we've used on past platforms. Let's add the MPLLB programming sequences; they'll be plugged into the rest of the code in future patches. Bspec: 54032 Bspec: 53881 Cc: Lucas De Marchi Signed-off-by: Matt Roper Signed-off-by: Vandita Kulkarni Signed-off-by: Jani Nikula Signed-off-by: Nidhi Gupta --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/display/intel_display.c | 1 + .../drm/i915/display/intel_display_types.h| 17 +- drivers/gpu/drm/i915/display/intel_dpll.c | 12 +- drivers/gpu/drm/i915/display/intel_snps_phy.c | 517 ++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 18 + drivers/gpu/drm/i915/i915_reg.h | 56 ++ 7 files changed, 616 insertions(+), 6 deletions(-) create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.c create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 01f28ad5ea57..6b6c1e5a72d5 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -268,6 +268,7 @@ i915-y += \ display/intel_pps.o \ display/intel_qp_tables.o \ display/intel_sdvo.o \ + display/intel_snps_phy.o \ display/intel_tv.o \ display/intel_vdsc.o \ display/intel_vrr.o \ diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index dc2b943a4e72..91f6964ec406 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -59,6 +59,7 @@ #include "display/intel_hdmi.h" #include "display/intel_lvds.h" #include "display/intel_sdvo.h" +#include "display/intel_snps_phy.h" #include "display/intel_tv.h" #include "display/intel_vdsc.h" #include "display/intel_vrr.h" diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index d94f361b548b..29ae1d9b5abc 100644 --- 
a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -884,6 +884,18 @@ enum intel_output_format { INTEL_OUTPUT_FORMAT_YCBCR444, }; +struct intel_mpllb_state { + u32 clock; /* in KHz */ + u32 ref_control; + u32 mpllb_cp; + u32 mpllb_div; + u32 mpllb_div2; + u32 mpllb_fracn1; + u32 mpllb_fracn2; + u32 mpllb_sscen; + u32 mpllb_sscstep; +}; + struct intel_crtc_state { /* * uapi (drm) state. This is the software state shown to userspace. @@ -1018,7 +1030,10 @@ struct intel_crtc_state { struct intel_shared_dpll *shared_dpll; /* Actual register state of the dpll, for shared dpll cross-checking. */ - struct intel_dpll_hw_state dpll_hw_state; + union { + struct intel_dpll_hw_state dpll_hw_state; + struct intel_mpllb_state mpllb_state; + }; /* * ICL reserved DPLLs for the CRTC/port. The active PLL is selected by diff --git a/drivers/gpu/drm/i915/display/intel_dpll.c b/drivers/gpu/drm/i915/display/intel_dpll.c index 89635da9f6f6..14515e62c05e 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll.c +++ b/drivers/gpu/drm/i915/display/intel_dpll.c @@ -11,6 +11,7 @@ #include "intel_lvds.h" #include "intel_panel.h" #include "intel_sideband.h" +#include "display/intel_snps_phy.h" struct intel_limit { struct { @@ -923,12 +924,13 @@ static int hsw_crtc_compute_clock(struct intel_crtc *crtc, struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); struct intel_atomic_state *state = to_intel_atomic_state(crtc_state->uapi.state); + struct intel_encoder *encoder = + intel_get_crtc_new_encoder(state, crtc_state); - if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI) || - DISPLAY_VER(dev_priv) >= 11) { - struct intel_encoder *encoder = - intel_get_crtc_new_encoder(state, crtc_state); - + if (IS_DG2(dev_priv)) { + return intel_mpllb_calc_state(crtc_state, encoder); + } else if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI) || + DISPLAY_VER(dev_priv) >= 11) { if (!intel_reserve_shared_dplls(state, crtc, encoder)) { 
drm_dbg_kms(&dev_priv->drm, "failed to find PLL for pipe %c\n", diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c new file mode 100644 index ..6d9205906595 --- /dev/null +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -0,0 +1,517 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include "intel_de.h" +#include "intel_display_types.h" +#include "intel_snps_phy.h" + +/** + * DOC: Syno
[PATCH 34/53] drm/i915/dg2: Add cdclk table and reference clock
Note that DG2 only has a single possible refclk frequency (38.4 MHz). Bspec: 54034 Cc: Lucas De Marchi Signed-off-by: Anusha Srivatsa Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_cdclk.c | 24 -- 1 file changed, 22 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c index 613ffcc68eba..08f34d87684f 100644 --- a/drivers/gpu/drm/i915/display/intel_cdclk.c +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c @@ -1290,6 +1290,18 @@ static const struct intel_cdclk_vals adlp_cdclk_table[] = { {} }; +static const struct intel_cdclk_vals dg2_cdclk_table[] = { + { .refclk = 38400, .cdclk = 172800, .divider = 2, .ratio = 9 }, + { .refclk = 38400, .cdclk = 179200, .divider = 3, .ratio = 14 }, + { .refclk = 38400, .cdclk = 192000, .divider = 2, .ratio = 10 }, + { .refclk = 38400, .cdclk = 192000, .divider = 3, .ratio = 15 }, + { .refclk = 38400, .cdclk = 307200, .divider = 2, .ratio = 16 }, + { .refclk = 38400, .cdclk = 326400, .divider = 4, .ratio = 34 }, + { .refclk = 38400, .cdclk = 556800, .divider = 2, .ratio = 29 }, + { .refclk = 38400, .cdclk = 652800, .divider = 2, .ratio = 34 }, + {} +}; + static int bxt_calc_cdclk(struct drm_i915_private *dev_priv, int min_cdclk) { const struct intel_cdclk_vals *table = dev_priv->cdclk.table; @@ -1408,7 +1420,9 @@ static void bxt_de_pll_readout(struct drm_i915_private *dev_priv, { u32 val, ratio; - if (DISPLAY_VER(dev_priv) >= 11) + if (IS_DG2(dev_priv)) + cdclk_config->ref = 38400; + else if (DISPLAY_VER(dev_priv) >= 11) icl_readout_refclk(dev_priv, cdclk_config); else if (IS_CANNONLAKE(dev_priv)) cnl_readout_refclk(dev_priv, cdclk_config); @@ -2878,7 +2892,13 @@ u32 intel_read_rawclk(struct drm_i915_private *dev_priv) */ void intel_init_cdclk_hooks(struct drm_i915_private *dev_priv) { - if (IS_ALDERLAKE_P(dev_priv)) { + if (IS_DG2(dev_priv)) { + dev_priv->display.set_cdclk = bxt_set_cdclk; + dev_priv->display.bw_calc_min_cdclk = 
skl_bw_calc_min_cdclk; + dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk; + dev_priv->display.calc_voltage_level = tgl_calc_voltage_level; + dev_priv->cdclk.table = dg2_cdclk_table; + } else if (IS_ALDERLAKE_P(dev_priv)) { dev_priv->display.set_cdclk = bxt_set_cdclk; dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk; dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk; -- 2.25.4
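A note for readers checking the table in this patch: every dg2_cdclk_table entry appears to satisfy cdclk = refclk * ratio / divider (all in kHz). The following is a minimal standalone userspace sketch of that check, not driver code. The names struct cdclk_vals, dg2_table, cdclk_from_pll() and count_mismatches() are hypothetical, and the relation itself is inferred from the table values rather than quoted from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace mirror of the fields used in dg2_cdclk_table. */
struct cdclk_vals {
	unsigned int refclk;	/* kHz */
	unsigned int cdclk;	/* kHz */
	unsigned int divider;
	unsigned int ratio;
};

/* Values copied from the patch. */
static const struct cdclk_vals dg2_table[] = {
	{ 38400, 172800, 2,  9 },
	{ 38400, 179200, 3, 14 },
	{ 38400, 192000, 2, 10 },
	{ 38400, 192000, 3, 15 },
	{ 38400, 307200, 2, 16 },
	{ 38400, 326400, 4, 34 },
	{ 38400, 556800, 2, 29 },
	{ 38400, 652800, 2, 34 },
};

#define DG2_TABLE_LEN (sizeof(dg2_table) / sizeof(dg2_table[0]))

/* Assumed relation, inferred from the table: cdclk = refclk * ratio / divider. */
static unsigned int cdclk_from_pll(const struct cdclk_vals *v)
{
	assert(v->divider != 0);
	return v->refclk * v->ratio / v->divider;
}

/* Returns the number of table entries that do not satisfy the relation. */
static size_t count_mismatches(void)
{
	size_t i, bad = 0;

	for (i = 0; i < DG2_TABLE_LEN; i++)
		if (cdclk_from_pll(&dg2_table[i]) != dg2_table[i].cdclk)
			bad++;

	return bad;
}
```

Calling count_mismatches() from a small test program is enough to confirm that the table rows are internally consistent under this assumed relation.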
[PATCH 13/53] drm/i915/xehp: New engine context offsets
From: Prathap Kumar Valsan The layout of some engine contexts has changed on Xe_HP. Define the new offsets. Bspec: 45585, 46256 Signed-off-by: Prathap Kumar Valsan Signed-off-by: Ramalingam C Signed-off-by: Venkata Ramana Nayana Signed-off-by: Akeem G Abodunrin Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_lrc.c | 65 ++--- 1 file changed, 59 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index e1c80e2c06d8..fee735e2a524 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -484,6 +484,47 @@ static const u8 gen12_rcs_offsets[] = { END }; +static const u8 xehp_rcs_offsets[] = { + NOP(1), + LRI(13, POSTED), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x1c0), + REG(0x1c4), + REG(0x1c8), + REG(0x180), + REG16(0x2b4), + + NOP(5), + LRI(9, POSTED), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + LRI(3, POSTED), + REG(0x1b0), + REG16(0x5a8), + REG16(0x5ac), + + NOP(6), + LRI(1, 0), + REG(0x0c8), + + END +}; + #undef END #undef REG16 #undef REG @@ -502,7 +543,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) !intel_engine_has_relative_mmio(engine)); if (engine->class == RENDER_CLASS) { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + return xehp_rcs_offsets; + else if (GRAPHICS_VER(engine->i915) >= 12) return gen12_rcs_offsets; else if (GRAPHICS_VER(engine->i915) >= 11) return gen11_rcs_offsets; @@ -522,7 +565,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) static int lrc_ring_mi_mode(const struct intel_engine_cs *engine) { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + return 0x70; + else if (GRAPHICS_VER(engine->i915) >= 12) return 
0x60; else if (GRAPHICS_VER(engine->i915) >= 9) return 0x54; @@ -534,7 +579,9 @@ static int lrc_ring_mi_mode(const struct intel_engine_cs *engine) static int lrc_ring_gpr0(const struct intel_engine_cs *engine) { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + return 0x84; + else if (GRAPHICS_VER(engine->i915) >= 12) return 0x74; else if (GRAPHICS_VER(engine->i915) >= 9) return 0x68; @@ -578,10 +625,16 @@ static int lrc_ring_indirect_offset(const struct intel_engine_cs *engine) static int lrc_ring_cmd_buf_cctl(const struct intel_engine_cs *engine) { - if (engine->class != RENDER_CLASS) - return -1; - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + /* +* Note that the CSFE context has a dummy slot for CMD_BUF_CCTL +* simply to match the RCS context image layout. +*/ + return 0xc6; + else if (engine->class != RENDER_CLASS) + return -1; + else if (GRAPHICS_VER(engine->i915) >= 12) return 0xb6; else if (GRAPHICS_VER(engine->i915) >= 11) return 0xaa; -- 2.25.4
[PATCH 52/53] drm/i915/dg2: Update to bigjoiner path
From: Animesh Manna In verify_mpllb_state() the encoder is retrieved from the best_encoder of the connector_state. Since there is only one connector_state for bigjoiner, checking the encoder may not be needed for the bigjoiner slave. This mpll-related code path runs on DG2 and needs this fix to avoid a NULL pointer dereference. Cc: Manasi Navare Signed-off-by: Animesh Manna Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_display.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 9655f1b1b41b..3f4e811145b6 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9153,6 +9153,9 @@ verify_mpllb_state(struct intel_atomic_state *state, if (!new_crtc_state->hw.active) return; + if (new_crtc_state->bigjoiner_slave) + return; + encoder = intel_get_crtc_new_encoder(state, new_crtc_state); intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state); -- 2.25.4
[PATCH 25/53] drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV
DG2 supports compute DSS and has the same maximum number of DSS and EU as XeHP SDV. Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 5d3b8dff464c..eaff221db5b0 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -171,7 +171,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * across the entire device. Then calculate out the DSS for each * workload type within that software slice. */ - if (IS_XEHPSDV(gt->i915)) { + if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) { intel_sseu_set_info(sseu, 1, 32, 16); sseu->has_compute_dss = 1; } else { -- 2.25.4
[PATCH 16/53] drm/i915/xehpsdv: add initial XeHP SDV definitions
From: Lucas De Marchi XeHP SDV is an Intel® dGPU without display. This is just the definition of some basic platform macros, by and large a copy of the current state of Tigerlake, which does not reflect the end state of this platform. Bspec: 44467, 48077 Cc: Rodrigo Vivi Signed-off-by: Lucas De Marchi Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: José Roberto de Souza Signed-off-by: Stuart Summers Signed-off-by: Tomas Winkler Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_drv.h | 10 ++ drivers/gpu/drm/i915/i915_pci.c | 20 drivers/gpu/drm/i915/intel_device_info.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 1 + 4 files changed, 32 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c02600850246..63bed18a2be7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_DG1(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG1) #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) +#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) #define IS_BDW_ULT(dev_priv) \ @@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, (IS_ALDERLAKE_P(__i915) && \ IS_GT_STEP(__i915, since, until)) +#define XEHPSDV_REVID_A0 0x0 +#define XEHPSDV_REVID_A1 0x1 +#define XEHPSDV_REVID_A_LAST XEHPSDV_REVID_A1 +#define XEHPSDV_REVID_B0 0x4 +#define XEHPSDV_REVID_C0 0x8 + +#define IS_XEHPSDV_REVID(p, since, until) \ + (IS_XEHPSDV(p) && IS_REVID(p, since, until)) + #define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) diff --git a/drivers/gpu/drm/i915/i915_pci.c 
b/drivers/gpu/drm/i915/i915_pci.c index 88b279452b87..046309e95f43 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = { .ppgtt_size = 48, \ .ppgtt_type = INTEL_PPGTT_FULL +#define XE_HPM_FEATURES \ + .media_ver = 12, \ + .media_ver_release = 50 + +__maybe_unused +static const struct intel_device_info xehpsdv_info = { + XE_HP_FEATURES, + XE_HPM_FEATURES, + DGFX_FEATURES, + PLATFORM(INTEL_XEHPSDV), + .display = { }, + .pipe_mask = 0, + .platform_engine_mask = + BIT(RCS0) | BIT(BCS0) | + BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) | + BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) | + BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7), + .require_force_probe = 1, +}; + #undef PLATFORM /* diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index e8ad14f002c1..7b37b68f4548 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -68,6 +68,7 @@ static const char * const platform_names[] = { PLATFORM_NAME(DG1), PLATFORM_NAME(ALDERLAKE_S), PLATFORM_NAME(ALDERLAKE_P), + PLATFORM_NAME(XEHPSDV), }; #undef PLATFORM_NAME diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index f824de632cfe..e8684199b0c9 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -88,6 +88,7 @@ enum intel_platform { INTEL_DG1, INTEL_ALDERLAKE_S, INTEL_ALDERLAKE_P, + INTEL_XEHPSDV, INTEL_MAX_PLATFORMS }; -- 2.25.4
[PATCH 28/53] drm/i915/dg2: Add SQIDI steering
Although DG2_G10 platforms will always have all SQIDI's present and don't need steering for registers in a SQIDI MMIO range, this isn't true for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those. We handle SQIDI ranges a bit differently from other types of explicit steering. The SQIDI ranges belong to either the MCFG unit or the SF unit, both of which have their own dedicated steering registers and do not use the typical 0xFDC steering control that all other types of ranges use. Thus we only need to worry about picking a valid initial value for the MCFG and SF steering registers (0xFD0 and 0xFD8 respectively) at driver init; they won't change after we set them up so we don't need to worry about re-steering them explicitly at runtime. Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV, while only values of 2 and 3 are valid for DG2-G11, we'll just initialize the MCFG and SF steering registers to a constant value of "2" for all XeHP-based platforms for simplicity --- that will work in all cases. 
Bspec: 66534 Cc: Radhakrishna Sripada Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 + drivers/gpu/drm/i915/i915_reg.h | 2 ++ 2 files changed, 25 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 4302dc1b728e..f97ff2848122 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -944,17 +944,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS); } -static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal, -unsigned slice, unsigned subslice) +static void __set_mcr_steering(struct i915_wa_list *wal, + i915_reg_t steering_reg, + unsigned int slice, unsigned int subslice) { u32 mcr, mcr_mask; mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice); mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK; - drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr); + wa_write_clr_set(wal, steering_reg, mcr_mask, mcr); +} + +static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal, +unsigned int slice, unsigned int subslice) +{ + drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice); - wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); + __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice); } static void @@ -1008,7 +1015,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) * - L3 Bank (fusable) * - MSLICE (fusable) * - LNCF (sub-unit within mslice; always present if mslice is present) -* - SQIDI (always on) * * We'll do our default/implicit steering based on GSLICE (in the * sliceid field) and DSS (in the subsliceid field). 
If we can @@ -1058,6 +1064,18 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0); __add_mcr_wa(i915, wal, slice, subslice); + + /* +* SQIDI ranges are special because they use different steering +* registers than everything else we work with. On XeHP SDV and +* DG2-G10, any value in the steering registers will work fine since +* all instances are present, but DG2-G11 only has SQIDI instances at +* ID's 2 and 3, so we need to steer to one of those. For simplicity +* we'll just steer to a hardcoded "2" since that value will work +* everywhere. +*/ + __set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2); + __set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2); } static void diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 2992e8585399..b19d102e0a01 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2686,6 +2686,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN12_SC_INSTDONE_EXTRA2 _MMIO(0x7108) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define MCFG_MCR_SELECTOR _MMIO(0xfd0) +#define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc) #define GEN8_MCR_SLICE(slice) (((slice) & 3) << 26) #define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3) -- 2.25.4
[PATCH 19/53] drm/i915/xehpsdv: Add compute DSS type
From: Stuart Summers Starting in XeHP, the concept of slice has been removed in favor of DSS (Dual-Subslice) masks for various workload types. These workloads have been divided into those enabled for geometry and those enabled for compute. i915 currently maintains a single set of S/SS/EU masks for the device. The goal of this patch set is to minimize the amount of impact to prior generations while still giving the user maximum flexibility. Bspec: 33117, 33118, 20376 Cc: Daniele Ceraolo Spurio Cc: Matt Roper Signed-off-by: Stuart Summers Signed-off-by: Steve Hampson Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_sseu.c | 73 drivers/gpu/drm/i915/gt/intel_sseu.h | 5 +- drivers/gpu/drm/i915/i915_reg.h | 3 +- include/uapi/drm/i915_drm.h | 3 -- 4 files changed, 59 insertions(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 16c0552fcd1d..5d3b8dff464c 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice) } void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice, - u32 ss_mask) + u8 *subslice_mask, u32 ss_mask) { int offset = slice * sseu->ss_stride; - memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride); + memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride); } unsigned int @@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu) return total; } -static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, - u8 s_en, u32 ss_en, u16 eu_en) +static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en) +{ + u32 ss_mask; + + ss_mask = ss_en >> (s * sseu->max_subslices); + ss_mask &= GENMASK(sseu->max_subslices - 1, 0); + + return ss_mask; +} + +static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, + u32 g_ss_en, u32 c_ss_en, u16 eu_en) { int s, ss; - /* ss_en represents entire 
subslice mask across all slices */ + /* g_ss_en/c_ss_en represent entire subslice mask across all slices */ GEM_BUG_ON(sseu->max_slices * sseu->max_subslices > - sizeof(ss_en) * BITS_PER_BYTE); + sizeof(g_ss_en) * BITS_PER_BYTE); for (s = 0; s < sseu->max_slices; s++) { if ((s_en & BIT(s)) == 0) @@ -115,7 +125,23 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, sseu->slice_mask |= BIT(s); - intel_sseu_set_subslices(sseu, s, ss_en); + /* +* XeHP introduces the concept of compute vs +* geometry DSS. To reduce variation between GENs +* around subslice usage, store a mask for both the +* geometry and compute enabled masks, to provide +* to user space later in QUERY_TOPOLOGY_INFO, and +* compute a total enabled subslice count for the +* purposes of selecting subslices to use in a +* particular GEM context. +*/ + intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask, +get_ss_stride_mask(sseu, s, c_ss_en)); + intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask, +get_ss_stride_mask(sseu, s, g_ss_en)); + intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, +get_ss_stride_mask(sseu, s, + g_ss_en | c_ss_en)); for (ss = 0; ss < sseu->max_subslices; ss++) if (intel_sseu_has_subslice(sseu, s, ss)) @@ -129,7 +155,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = &gt->info.sseu; struct intel_uncore *uncore = gt->uncore; - u32 dss_en; + u32 g_dss_en, c_dss_en = 0; u16 eu_en = 0; u8 eu_en_fuse; u8 s_en; @@ -145,10 +171,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * across the entire device. Then calculate out the DSS for each * workload type within that software slice. */ - if (IS_XEHPSDV(gt->i915)) + if (IS_XEHPSDV(gt->i915)) { intel_sseu_set_info(sseu, 1, 32, 16); - else + sseu->has_compute_dss = 1; + } else { intel_sseu_set_info(sseu, 1, 6, 16); + } /* * As mentioned above, Xe_HP does not have the concept of a slice. 
@@ -160,7 +188,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt) s_en = intel
[PATCH 09/53] drm/i915/xehp: Extra media engines - Part 3 (reset)
From: John Harrison Xe_HP can have a lot of extra media engines. This patch adds the reset support for them. Signed-off-by: John Harrison Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_reset.c | 6 ++ drivers/gpu/drm/i915/i915_reg.h | 8 2 files changed, 14 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 72251638d4ea..9586613ee399 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -515,8 +515,14 @@ static int gen11_reset_engines(struct intel_gt *gt, [VCS1] = GEN11_GRDOM_MEDIA2, [VCS2] = GEN11_GRDOM_MEDIA3, [VCS3] = GEN11_GRDOM_MEDIA4, + [VCS4] = GEN11_GRDOM_MEDIA5, + [VCS5] = GEN11_GRDOM_MEDIA6, + [VCS6] = GEN11_GRDOM_MEDIA7, + [VCS7] = GEN11_GRDOM_MEDIA8, [VECS0] = GEN11_GRDOM_VECS, [VECS1] = GEN11_GRDOM_VECS2, + [VECS2] = GEN11_GRDOM_VECS3, + [VECS3] = GEN11_GRDOM_VECS4, }; struct intel_engine_cs *engine; intel_engine_mask_t tmp; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index cb1716b6ce72..dbc233442dd0 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -395,10 +395,18 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN11_GRDOM_MEDIA2 (1 << 6) #define GEN11_GRDOM_MEDIA3 (1 << 7) #define GEN11_GRDOM_MEDIA4 (1 << 8) +#define GEN11_GRDOM_MEDIA5 (1 << 9) +#define GEN11_GRDOM_MEDIA6 (1 << 10) +#define GEN11_GRDOM_MEDIA7 (1 << 11) +#define GEN11_GRDOM_MEDIA8 (1 << 12) #define GEN11_GRDOM_VECS (1 << 13) #define GEN11_GRDOM_VECS2 (1 << 14) +#define GEN11_GRDOM_VECS3 (1 << 15) +#define GEN11_GRDOM_VECS4 (1 << 16) #define GEN11_GRDOM_SFC0 (1 << 17) #define GEN11_GRDOM_SFC1 (1 << 18) +#define GEN11_GRDOM_SFC2 (1 << 19) +#define GEN11_GRDOM_SFC3 (1 << 20) #define GEN11_VCS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << ((instance) >> 1)) #define GEN11_VECS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << (instance)) -- 2.25.4
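To illustrate why the patch above also adds GEN11_GRDOM_SFC2 and GEN11_GRDOM_SFC3: with the existing GEN11_VCS_SFC_RESET_BIT() and GEN11_VECS_SFC_RESET_BIT() macros, the new engine instances (VCS4..VCS7, VECS2, VECS3) land on those two bits. The sketch below is standalone userspace C using the macro definitions from this hunk; the wrapper functions vcs_sfc_reset_bit() and vecs_sfc_reset_bit() are hypothetical helpers added only for illustration, and the instance limits in the asserts are assumptions based on the engines listed in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch's i915_reg.h hunk. */
#define GEN11_GRDOM_SFC0	(1 << 17)
#define GEN11_GRDOM_SFC1	(1 << 18)
#define GEN11_GRDOM_SFC2	(1 << 19)
#define GEN11_GRDOM_SFC3	(1 << 20)
#define GEN11_VCS_SFC_RESET_BIT(instance)	(GEN11_GRDOM_SFC0 << ((instance) >> 1))
#define GEN11_VECS_SFC_RESET_BIT(instance)	(GEN11_GRDOM_SFC0 << (instance))

/* Per the macro, VCS instances pair up: VCS0..VCS7 map onto SFC0..SFC3. */
static uint32_t vcs_sfc_reset_bit(unsigned int instance)
{
	assert(instance < 8);	/* assumed upper bound: VCS0..VCS7 */
	return GEN11_VCS_SFC_RESET_BIT(instance);
}

/* Per the macro, VECS0..VECS3 map one-to-one onto SFC0..SFC3. */
static uint32_t vecs_sfc_reset_bit(unsigned int instance)
{
	assert(instance < 4);	/* assumed upper bound: VECS0..VECS3 */
	return GEN11_VECS_SFC_RESET_BIT(instance);
}
```

For example, VCS6 and VCS7 both resolve to GEN11_GRDOM_SFC3, and VECS2 resolves to GEN11_GRDOM_SFC2.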
[PATCH 27/53] drm/i915/dg2: Update LNCF steering ranges
DG2's replicated register ranges are almost the same as XeHP SDV with the exception of one LNCF sub-range that switches to gslice steering. We can re-use the XeHP SDV mslice steering table and just provide a DG2-specific LNCF steering table. Bspec: 66534 Cc: Daniele Ceraolo Spurio Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 9d1c99c9c0dd..d640fd37792f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -103,6 +103,12 @@ static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = { {}, }; +static const struct intel_mmio_range dg2_lncf_steering_table[] = { + { 0x00B000, 0x00B0FF }, + { 0x00D880, 0x00D8FF }, + {}, +}; + static u16 slicemask(struct intel_gt *gt, int count) { u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0); @@ -127,7 +133,10 @@ int intel_gt_init_mmio(struct intel_gt *gt) (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN12_MEML3_EN_MASK); - if (IS_XEHPSDV(gt->i915)) { + if (IS_DG2(gt->i915)) { + gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; + gt->steering_table[LNCF] = dg2_lncf_steering_table; + } else if (IS_XEHPSDV(gt->i915)) { gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; gt->steering_table[LNCF] = xehpsdv_lncf_steering_table; } else if (GRAPHICS_VER(gt->i915) >= 11 && -- 2.25.4
[PATCH 24/53] drm/i915/dg2: add DG2 platform info
DG2 has Xe_LPD display (version 13) and Xe_HPG (version 12.55) graphics. There are two variants (treated as subplatforms in the code): DG2-G10 and DG2-G11 that require independent programming in some areas (e.g., workarounds). Bspec: 44472, 44474, 46197, 48028, 48077 Cc: Anusha Srivatsa Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_drv.h | 27 drivers/gpu/drm/i915/i915_pci.c | 16 ++ drivers/gpu/drm/i915/intel_device_info.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 5 + drivers/gpu/drm/i915/intel_step.c| 20 +- drivers/gpu/drm/i915/intel_step.h| 1 + 6 files changed, 69 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 63bed18a2be7..828ad607795a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1407,6 +1407,11 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) #define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) +#define IS_DG2(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG2) +#define IS_DG2_G10(dev_priv) \ + IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G10) +#define IS_DG2_G11(dev_priv) \ + IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G11) #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) #define IS_BDW_ULT(dev_priv) \ @@ -1574,6 +1579,28 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_XEHPSDV_REVID(p, since, until) \ (IS_XEHPSDV(p) && IS_REVID(p, since, until)) +/* + * DG2 hardware steppings are a bit unusual. The hardware design was forked + * to create two variants (G10 and G11) which have distinct workaround sets. 
+ * The G11 fork of the DG2 design resets the GT stepping back to "A0" for its + * first iteration, even though it's more similar to a G10 B0 stepping in terms + * of functionality and workarounds. However the display stepping does not + * reset in the same manner --- a specific stepping like "B0" has a consistent + * meaning regardless of whether it belongs to a G10 or G11 DG2. + * + * TLDR: All GT workarounds and stepping-specific logic must be applied in + * relation to a specific subplatform (G10 or G11), whereas display workarounds + * and stepping-specific logic will be applied with a general DG2-wide stepping + * number. + */ +#define IS_DG2_GT_STEP(__i915, variant, since, until) \ + (IS_SUBPLATFORM(__i915, INTEL_DG2, INTEL_SUBPLATFORM_##variant) && \ +IS_GT_STEP(__i915, since, until)) + +#define IS_DG2_DISP_STEP(__i915, since, until) \ + (IS_DG2(__i915) && \ +IS_DISPLAY_STEP(__i915, since, until)) + #define IS_LP(dev_priv)(INTEL_INFO(dev_priv)->is_lp) #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 046309e95f43..a41e1792c0ff 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1040,6 +1040,22 @@ static const struct intel_device_info xehpsdv_info = { .require_force_probe = 1, }; +__maybe_unused +static const struct intel_device_info dg2_info = { + XE_HP_FEATURES, + XE_HPM_FEATURES, + XE_LPD_FEATURES, + DGFX_FEATURES, + .graphics_ver_release = 55, + .media_ver_release = 55, + PLATFORM(INTEL_DG2), + .platform_engine_mask = + BIT(RCS0) | BIT(BCS0) | + BIT(VECS0) | BIT(VECS1) | + BIT(VCS0) | BIT(VCS2), + .require_force_probe = 1, +}; + #undef PLATFORM /* diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7b37b68f4548..41205dc356b7 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ 
b/drivers/gpu/drm/i915/intel_device_info.c @@ -69,6 +69,7 @@ static const char * const platform_names[] = { PLATFORM_NAME(ALDERLAKE_S), PLATFORM_NAME(ALDERLAKE_P), PLATFORM_NAME(XEHPSDV), + PLATFORM_NAME(DG2), }; #undef PLATFORM_NAME diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index e8684199b0c9..856f0aa5d68f 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -89,6 +89,7 @@ enum intel_platform { INTEL_ALDERLAKE_S, INTEL_ALDERLAKE_P, INTEL_XEHPSDV, + INTEL_DG2, INTEL_MAX_PLATFORMS }; @@ -107,6 +108,10 @@ enum intel_platform { /* CNL/ICL */ #define INTEL_SUBPLATFORM_PORTF(0) +/* DG2 */ +#define INTEL_SUBPLATFORM_G10 0 +#define INTEL_SUBPLATFORM_G11
[PATCH 01/53] drm/i915: Add "release id" version
From: Lucas De Marchi Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`. However, the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms that come before that register is useful, we will set the values in software, and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check. After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and it will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver. Also, currently there is not much use for the release id in media and display, so keep them out. This is a mix of 2 independent changes: one by me and the other by Matt Roper. 
Cc: Matt Roper Signed-off-by: Lucas De Marchi Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_drv.h | 6 ++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n)(GRAPHICS_VER(dev_priv) == (n)) +#define IP_VER(ver, release) ((ver) << 8 | (release)) + #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver) +#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \ + INTEL_INFO(i915)->graphics_ver_release) #define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until)) #define MEDIA_VER(i915)(INTEL_INFO(i915)->media_ver) +#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \ + INTEL_INFO(i915)->media_ver_release) #define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until)) diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver); + drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release); drm_printf(p, "media_ver: %u\n", info->media_ver); + drm_printf(p, "media_ver_release: %u\n", info->media_ver_release); drm_printf(p, "display_ver: %u\n", info->display.ver); drm_printf(p, "gt: %d\n", info->gt); drm_printf(p, "iommu: %s\n", iommu_name()); diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h index b326aff65cd6..944a5ff4df49 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -162,7 +162,9 @@ enum intel_ppgtt_type { struct intel_device_info { u8 graphics_ver; + u8 graphics_ver_release; u8 media_ver; + u8 media_ver_release; u8 gt; /* GT number, 0 if undefined */ intel_engine_mask_t platform_engine_mask; /* Engines supported by the HW */ -- 2.25.4
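To make the IP_VER() packing from this patch concrete: the version goes in the high byte and the release in the low byte, so a plain integer comparison orders (version, release) pairs as expected. The sketch below is standalone userspace C. IP_VER() is copied from the patch, while struct device_info, graphics_ver_full() and is_xehp_or_later() are hypothetical stand-ins for illustration and are not the driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Same packing as the patch: version in the high byte, release in the low byte. */
#define IP_VER(ver, release) ((ver) << 8 | (release))

/* Hypothetical cut-down stand-in for struct intel_device_info. */
struct device_info {
	uint8_t graphics_ver;
	uint8_t graphics_ver_release;
};

/* Userspace analogue of GRAPHICS_VER_FULL(). */
static int graphics_ver_full(const struct device_info *info)
{
	return IP_VER(info->graphics_ver, info->graphics_ver_release);
}

/* Mirrors the style of check used later in the series (>= 12.50). */
static int is_xehp_or_later(const struct device_info *info)
{
	return graphics_ver_full(info) >= IP_VER(12, 50);
}

/* Compile-time checks of the ordering properties. */
static_assert(IP_VER(12, 50) == 0x0c32, "12.50 packs to 0x0c32");
static_assert(IP_VER(12, 0) < IP_VER(12, 50), "release orders within a version");
static_assert(IP_VER(12, 55) < IP_VER(13, 0), "version dominates release");
```

With this packing, a device described as { 12, 0 } fails the 12.50 check while one described as { 12, 55 } passes it, which is the kind of grouping the commit message describes.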
[PATCH 05/53] drm/i915/gen12: Use fuse info to enable SFC
From: Venkata Sandeep Dhanalakota In Gen12 there are various fuse combinations, and in each configuration a VDBOX engine may be connected to an SFC depending on which engines are available, so we need to set the SFC capability based on the fuse value from the hardware. Even-numbered physical instances always have an SFC; odd-numbered physical instances have an SFC only if the previous even instance is fused off. Bspec: 48028 Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Signed-off-by: Venkata Sandeep Dhanalakota Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 151870d8fdd3..4ab2c9abb943 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt) } } +static inline +bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox, + unsigned int logical_vdbox, u16 vdbox_mask) +{ + /* +* In Gen11, only even numbered logical VDBOXes are hooked +* up to an SFC (Scaler & Format Converter) unit. +* In Gen12, even numbered physical instances are always connected +* to an SFC. Odd numbered physical instances have SFC only if +* previous even instance is fused off. +*/ + if (GRAPHICS_VER(i915) == 12) { + return (physical_vdbox % 2 == 0) || + !(BIT(physical_vdbox - 1) & vdbox_mask); + } else if (GRAPHICS_VER(i915) == 11) { + return logical_vdbox % 2 == 0; + } + + MISSING_CASE(GRAPHICS_VER(i915)); + return false; +} + /* * Determine which engines are fused off in our particular hardware. * Note that we have a catch-22 situation where we need to be able to access @@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) continue; } - /* -* In Gen11, only even numbered logical VDBOXes are -* hooked up to an SFC (Scaler & Format Converter) unit. 
-* In TGL each VDBOX has access to an SFC. -*/ - if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0) + if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask)) gt->info.vdbox_sfc_access |= BIT(i); + logical_vdbox++; } drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n", vdbox_mask, VDBOX_MASK(gt)); -- 2.25.4
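The Gen12 rule in this patch can be tried out in isolation. The following is a minimal userspace sketch, assuming vdbox_mask carries one bit per physical VDBOX that is present (not fused off), as in the patch. gen12_vdbox_has_sfc() and gen12_sfc_access() are hypothetical names, the local BIT() definition stands in for the kernel macro, and the 16-instance bound is an assumption taken from the u16 mask width rather than from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/*
 * Userspace rendition of the Gen12 branch of vdbox_has_sfc():
 * even physical instances always have an SFC, odd ones only when
 * the preceding even instance is fused off.
 */
static bool gen12_vdbox_has_sfc(unsigned int physical_vdbox, uint16_t vdbox_mask)
{
	assert(physical_vdbox < 16);	/* assumed bound: one bit per instance in a u16 */

	if (physical_vdbox % 2 == 0)
		return true;

	/* Odd instance: check whether its even neighbour is absent. */
	return !(BIT(physical_vdbox - 1) & vdbox_mask);
}

/* Builds an SFC access mask in the same spirit as the init_engine_mask() loop. */
static uint16_t gen12_sfc_access(uint16_t vdbox_mask)
{
	uint16_t access = 0;
	unsigned int i;

	for (i = 0; i < 16; i++) {
		if (!(BIT(i) & vdbox_mask))
			continue;	/* engine fused off, nothing to record */
		if (gen12_vdbox_has_sfc(i, vdbox_mask))
			access |= BIT(i);
	}

	return access;
}
```

With all of VCS0..VCS3 present (mask 0xF) only VCS0 and VCS2 get SFC access, whereas with VCS0 and VCS2 fused off (mask 0xA) both VCS1 and VCS3 do.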
[PATCH 02/53] drm/i915: Add XE_HP initial definitions
From: Lucas De Marchi Our _FEATURES macro went back to GEN7, extending each other, making it difficult to grasp what was really enabled/disabled. Take the opportunity of the GEN -> XE_HP name break and also break with the feature inheritance. For XE_HP this basically goes from GEN12 back to GEN7 coalescing the features making sure the overrides remain, remove all the display-specific features and sort it. Then also remove the definitions that would be overridden by DGFX_FEATURES and those that were 0 (since that is the default). Exception here is has_master_unit_irq: although it is a feature that started with DG1 and is true for all DGFX platforms, it's also true for XE_HP in general. Signed-off-by: Lucas De Marchi Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_pci.c | 24 1 file changed, 24 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index a7bfdd827bc8..dc0883bad9cf 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -995,6 +995,30 @@ static const struct intel_device_info adl_p_info = { }; #undef GEN + +#define XE_HP_PAGE_SIZES \ + .page_sizes = I915_GTT_PAGE_SIZE_4K | \ + I915_GTT_PAGE_SIZE_64K | \ + I915_GTT_PAGE_SIZE_2M + +#define XE_HP_FEATURES \ + .graphics_ver = 12, \ + .graphics_ver_release = 50, \ + XE_HP_PAGE_SIZES, \ + .dma_mask_size = 46, \ + .has_64bit_reloc = 1, \ + .has_global_mocs = 1, \ + .has_gt_uc = 1, \ + .has_llc = 1, \ + .has_logical_ring_contexts = 1, \ + .has_logical_ring_elsq = 1, \ + .has_rc6 = 1, \ + .has_reset_engine = 1, \ + .has_rps = 1, \ + .has_runtime_pm = 1, \ + .ppgtt_size = 48, \ + .ppgtt_type = INTEL_PPGTT_FULL + #undef PLATFORM /* -- 2.25.4
[PATCH 08/53] drm/i915/xehp: Extra media engines - Part 2 (interrupts)
From: John Harrison Xe_HP can have a lot of extra media engines. This patch adds the interrupt handler support for them. Cc: Tvrtko Ursulin Cc: Daniele Ceraolo Spurio Signed-off-by: John Harrison Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 - drivers/gpu/drm/i915/i915_reg.h| 3 +++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index c13462274fe8..b2de83be4d97 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK,~0); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) + intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) + intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) + intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0); @@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask); + if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) + intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask); + if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) + intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask); - + if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) + intel_uncore_write(uncore, 
GEN12_VECS2_VECS3_INTR_MASK, ~dmask); /* * RPS interrupts will get enabled/disabled on demand when RPS itself * is enabled/disabled. diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d4546e871833..cb1716b6ce72 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -8076,7 +8076,10 @@ enum { #define GEN11_BCS_RSVD_INTR_MASK _MMIO(0x1900a0) #define GEN11_VCS0_VCS1_INTR_MASK _MMIO(0x1900a8) #define GEN11_VCS2_VCS3_INTR_MASK _MMIO(0x1900ac) +#define GEN12_VCS4_VCS5_INTR_MASK _MMIO(0x1900b0) +#define GEN12_VCS6_VCS7_INTR_MASK _MMIO(0x1900b4) #define GEN11_VECS0_VECS1_INTR_MASK _MMIO(0x1900d0) +#define GEN12_VECS2_VECS3_INTR_MASK _MMIO(0x1900d4) #define GEN11_GUC_SG_INTR_MASK _MMIO(0x1900e8) #define GEN11_GPM_WGBOXPERF_INTR_MASK _MMIO(0x1900ec) #define GEN11_CRYPTO_RSVD_INTR_MASK _MMIO(0x1900f0) -- 2.25.4
[PATCH 10/53] drm/i915/xehp: Xe_HP forcewake support
Implement Xe_HP forcewake handling. While we're at it, let's reorder to the forcewake assignment if/else ladder to match our usual driver conventions. Co-authored-by: Daniele Ceraolo Spurio Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Stuart Summers Signed-off-by: Matt Roper --- .../drm/i915/gt/intel_execlists_submission.c | 4 + drivers/gpu/drm/i915/intel_uncore.c | 336 +++--- drivers/gpu/drm/i915/intel_uncore.h | 14 +- drivers/gpu/drm/i915/selftests/intel_uncore.c | 2 + 4 files changed, 302 insertions(+), 54 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index cdb2126a159a..15ba0d83151a 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -3318,6 +3318,10 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base)); execlists->ctrl_reg = uncore->regs + i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base)); + + engine->fw_domain = intel_uncore_forcewake_for_reg(engine->uncore, + RING_EXECLIST_CONTROL(engine->mmio_base), + FW_REG_WRITE); } else { execlists->submit_reg = uncore->regs + i915_mmio_reg_offset(RING_ELSP(base)); diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index d067524f9162..676b0052f01e 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -24,6 +24,8 @@ #include #include +#include "gt/intel_lrc_reg.h" /* for shadow reg list */ + #include "i915_drv.h" #include "i915_trace.h" #include "i915_vgpu.h" @@ -68,8 +70,14 @@ static const char * const forcewake_domain_names[] = { "vdbox1", "vdbox2", "vdbox3", + "vdbox4", + "vdbox5", + "vdbox6", + "vdbox7", "vebox0", "vebox1", + "vebox2", + "vebox3", }; const char * @@ -952,30 +960,80 @@ static const i915_reg_t gen8_shadowed_regs[] = { }; static const i915_reg_t gen11_shadowed_regs[] = { - 
RING_TAIL(RENDER_RING_BASE),/* 0x2000 (base) */ - GEN6_RPNSWREQ, /* 0xA008 */ - GEN6_RC_VIDEO_FREQ, /* 0xA00C */ - RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ - RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C (base) */ - RING_TAIL(GEN11_BSD2_RING_BASE),/* 0x1C4000 (base) */ - RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ - RING_TAIL(GEN11_BSD3_RING_BASE),/* 0x1D (base) */ - RING_TAIL(GEN11_BSD4_RING_BASE),/* 0x1D4000 (base) */ - RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_TAIL(RENDER_RING_BASE),/* 0x2000 (base) */ + RING_EXECLIST_CONTROL(RENDER_RING_BASE),/* 0x2550 */ + GEN6_RPNSWREQ, /* 0xA008 */ + GEN6_RC_VIDEO_FREQ, /* 0xA00C */ + RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ + RING_EXECLIST_CONTROL(BLT_RING_BASE), /* 0x22550 */ + RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD_RING_BASE), /* 0x1C0550 */ + RING_TAIL(GEN11_BSD2_RING_BASE),/* 0x1C4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD2_RING_BASE),/* 0x1C4550 */ + RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX_RING_BASE), /* 0x1C8550 */ + RING_TAIL(GEN11_BSD3_RING_BASE),/* 0x1D (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD3_RING_BASE),/* 0x1D0550 */ + RING_TAIL(GEN11_BSD4_RING_BASE),/* 0x1D4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD4_RING_BASE),/* 0x1D4550 */ + RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX2_RING_BASE), /* 0x1D8550 */ /* TODO: Other registers are not yet used */ }; static const i915_reg_t gen12_shadowed_regs[] = { - RING_TAIL(RENDER_RING_BASE),/* 0x2000 (base) */ - GEN6_RPNSWREQ, /* 0xA008 */ - GEN6_RC_VIDEO_FREQ, /* 0xA00C */ - RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ - RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C (base) */ - RING_TAIL(GEN11_BSD2_RING_BASE),/* 0x1C4000 (base) */ - RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ - RING_TAIL(GEN11_BSD3_RING_BASE),/* 0x1D (base) */ - RING_TAIL(GEN11_BS
[PATCH 11/53] drm/i915/xehp: Define multicast register ranges
Since we can't steer multicast register reads during ring-based workaround verification, we need to define the multicast ranges where failure to steer could potentially cause us to read back from a fused-off register instance. As with gen12, we can ignore the multicast ranges that the bspec describes as 'SQIDI' since all instances of those registers will always be present and we'll always be able to read back a workaround value that was written with multicast. Bspec: 66534 Cc: José Roberto de Souza Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 20 +++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index d9a5a445ceec..20c6ca28e407 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -2089,12 +2089,30 @@ static const struct mcr_range mcr_ranges_gen12[] = { {}, }; +static const struct mcr_range mcr_ranges_xehp[] = { + { .start = 0x4000, .end = 0x4aff }, + { .start = 0x5200, .end = 0x52ff }, + { .start = 0x5400, .end = 0x7fff }, + { .start = 0x8140, .end = 0x815f }, + { .start = 0x8c80, .end = 0x8dff }, + { .start = 0x94d0, .end = 0x955f }, + { .start = 0x9680, .end = 0x96ff }, + { .start = 0xb000, .end = 0xb3ff }, + { .start = 0xc800, .end = 0xcfff }, + { .start = 0xd800, .end = 0xd8ff }, + { .start = 0xdc00, .end = 0x }, + { .start = 0x17000, .end = 0x17fff }, + { .start = 0x24a00, .end = 0x24a7f }, +}; + static bool mcr_range(struct drm_i915_private *i915, u32 offset) { const struct mcr_range *mcr_ranges; int i; - if (GRAPHICS_VER(i915) >= 12) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + mcr_ranges = mcr_ranges_xehp; + else if (GRAPHICS_VER(i915) >= 12) mcr_ranges = mcr_ranges_gen12; else if (GRAPHICS_VER(i915) >= 8) mcr_ranges = mcr_ranges_gen8; -- 2.25.4
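Editorial sketch, not part of the patch: the change above selects a range table by IP version and then checks whether a register offset falls inside any entry. The self-contained version below assumes a zero-start sentinel entry ends each table and that a full version is packed as (version << 8 | release). All demo_* names are hypothetical. The first two "new" ranges are copied from the patch; the "old" range is a made-up placeholder.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/* Placeholder table for the older IP; the values are illustrative only. */
static const struct demo_range demo_ranges_old[] = {
	{ 0x1000, 0x1fff },
	{ 0, 0 },	/* sentinel: start == 0 ends the table */
};

/* First two entries of the Xe_HP table from the patch, plus a sentinel. */
static const struct demo_range demo_ranges_new[] = {
	{ 0x4000, 0x4aff },
	{ 0x5200, 0x52ff },
	{ 0, 0 },
};

/* Assumed packing of version and release into one comparable number. */
#define DEMO_IP_VER(ver, rel) (((unsigned int)(ver) << 8) | (unsigned int)(rel))

/* Newest IP is tested first so that 12.50 does not fall into the 12.x case. */
static const struct demo_range *demo_pick_table(unsigned int ver,
						unsigned int rel)
{
	if (DEMO_IP_VER(ver, rel) >= DEMO_IP_VER(12, 50))
		return demo_ranges_new;
	if (ver >= 12)
		return demo_ranges_old;
	return NULL;
}

/* Linear scan until the sentinel; true if offset lies in any range. */
static bool demo_in_ranges(const struct demo_range *r, uint32_t offset)
{
	if (!r)
		return false;
	for (; r->start; r++)
		if (offset >= r->start && offset <= r->end)
			return true;
	return false;
}
```

The order of the version checks matters: a 12.50 device also satisfies "version >= 12", so the more specific comparison has to come first, as it does in the patch.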
[PATCH 00/53] Begin enabling Xe_HP SDV and DG2 platforms
This series provides some of the initial enablement patches for two upcoming discrete GPUs:

 * XeHP SDV: Xe_HP (version 12.50) graphics IP, no display IP
 * DG2: Xe_HPG (version 12.55) graphics IP, Xe_LPD (version 13) display IP

Both platforms will need additional enablement patches beyond what's present in this series before they're truly usable, including various LMEM and GuC work that's already happening separately. The new features/functionality that these platforms bring (such as multi-tile support, dedicated compute engines, etc.) may be referenced in passing in some of these patches but will be fully enabled in future series.

Cc: Rodrigo Vivi
Cc: Lucas De Marchi
Cc: James Ausmus

Akeem G Abodunrin (1):
  drm/i915/dg2: Add new LRI reg offsets

Animesh Manna (1):
  drm/i915/dg2: Update to bigjoiner path

Ankit Nautiyal (1):
  drm/i915/dg2: Configure PCON in DP pre-enable path

Anusha Srivatsa (2):
  drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable
  drm/i915/display/dsc: Set BPP in the kernel

Daniele Ceraolo Spurio (1):
  drm/i915/xehp: handle new steering options

Gwan-gyeong Mun (1):
  drm/i915/dg2: Update lane disable power state during PSR

John Harrison (4):
  drm/i915/selftests: Allow for larger engine counts
  drm/i915/xehp: Extra media engines - Part 1 (engine definitions)
  drm/i915/xehp: Extra media engines - Part 2 (interrupts)
  drm/i915/xehp: Extra media engines - Part 3 (reset)

José Roberto de Souza (1):
  drm/i915/dg2: Add DG2 to the PSR2 defeature list

Lucas De Marchi (5):
  drm/i915: Add "release id" version
  drm/i915: Add XE_HP initial definitions
  drm/i915/xehpsdv: add initial XeHP SDV definitions
  drm/i915/xehpsdv: Define MOCS table for XeHP SDV
  drm/i915/xehpsdv: factor out function to read RP_STATE_CAP

Matt Roper (29):
  drm/i915/xehp: Xe_HP forcewake support
  drm/i915/xehp: Define multicast register ranges
  drm/i915/xehp: Loop over all gslices for INSTDONE processing
  drm/i915/xehpsdv: Add maximum sseu limits
  drm/i915/xehpsdv: Define steering tables
  drm/i915/xehpsdv: Read correct RP_STATE_CAP register
  drm/i915/dg2: add DG2 platform info
  drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV
  drm/i915/dg2: Add forcewake table
  drm/i915/dg2: Update LNCF steering ranges
  drm/i915/dg2: Add SQIDI steering
  drm/i915/dg2: Maintain backward-compatible nested batch behavior
  drm/i915/dg2: Report INSTDONE_GEOM values in error state
  drm/i915/dg2: Define MOCS table for DG2
  drm/i915/dg2: Add fake PCH
  drm/i915/dg2: Add cdclk table and reference clock
  drm/i915/dg2: Skip shared DPLL handling
  drm/i915/dg2: Don't wait for AUX power well enable ACKs
  drm/i915/dg2: Setup display outputs
  drm/i915/dg2: Add dbuf programming
  drm/i915/dg2: Don't program BW_BUDDY registers
  drm/i915/dg2: Don't read DRAM info
  drm/i915/dg2: DG2 has fixed memory bandwidth
  drm/i915/dg2: Add MPLLB programming for SNPS PHY
  drm/i915/dg2: Add MPLLB programming for HDMI
  drm/i915/dg2: Add vswing programming for SNPS phys
  drm/i915/dg2: Update modeset sequences
  drm/i915/dg2: Classify DG2 PHY types
  drm/i915/dg2: Wait for SNPS PHY calibration during display init

Matthew Auld (1):
  drm/i915/xehp: Changes to ss/eu definitions

Paulo Zanoni (1):
  drm/i915: Fork DG1 interrupt handler

Prathap Kumar Valsan (1):
  drm/i915/xehp: New engine context offsets

Stuart Summers (2):
  drm/i915/xehp: Handle new device context ID format
  drm/i915/xehpsdv: Add compute DSS type

Tvrtko Ursulin (1):
  drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based

Venkata Sandeep Dhanalakota (1):
  drm/i915/gen12: Use fuse info to enable SFC

 drivers/gpu/drm/i915/Makefile | 1 +
 drivers/gpu/drm/i915/display/intel_bw.c | 24 +-
 drivers/gpu/drm/i915/display/intel_cdclk.c | 24 +-
 drivers/gpu/drm/i915/display/intel_ddi.c | 165 +++-
 drivers/gpu/drm/i915/display/intel_display.c | 94 +-
 drivers/gpu/drm/i915/display/intel_display.h | 1 +
 .../drm/i915/display/intel_display_debugfs.c | 103 ++-
 .../drm/i915/display/intel_display_power.c | 25 +
 .../drm/i915/display/intel_display_power.h | 10 +
 .../drm/i915/display/intel_display_types.h | 18 +-
 drivers/gpu/drm/i915/display/intel_dp.c | 23 +-
 drivers/gpu/drm/i915/display/intel_dpll.c | 12 +-
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 5 +-
 drivers/gpu/drm/i915/display/intel_hdmi.c | 11 +
 drivers/gpu/drm/i915/display/intel_psr.c | 10 +-
 drivers/gpu/drm/i915/display/intel_snps_phy.c | 862 ++
 drivers/gpu/drm/i915/display/intel_snps_phy.h | 35 +
 drivers/gpu/drm/i915/gt/debugfs_gt_pm.c | 8 +-
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +-
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 144 ++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h | 29 +-
 .../drm/i915/gt/intel_execlists_submission.c | 78 +-
 drivers/gpu/drm/i915/gt/intel_gt.c | 66 +-
 drivers/gpu/drm/i915/gt/intel_gt.h
Re: [git pull] drm for 5.14-rc1
The pull request you sent on Thu, 1 Jul 2021 14:34:15 +1000: > git://anongit.freedesktop.org/drm/drm tags/drm-next-2021-07-01 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/e058a84bfddc42ba356a2316f2cf1141974625c9 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
[PATCH v1 2/2] drm/i915/gem: Migrate to system at dma-buf attach time
From: Thomas Hellström Until we support p2p dma or as a complement to that, migrate data to system memory at dma-buf attach time if possible. v2: - Rebase on dynamic exporter. Update the igt_dmabuf_import_same_driver selftest to migrate if we are LMEM capable. v3: - Migrate also in the pin() callback. v4: - Migrate in attach Signed-off-by: Thomas Hellström Signed-off-by: Michael J. Ruhl --- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 12 +++- drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c | 4 +++- 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index ccae17d5f441..280291a4a9dc 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -170,9 +170,19 @@ static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach) { struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf); + int ret; assert_object_held(obj); - return i915_gem_object_pin_pages(obj); + + if (!i915_gem_object_can_migrate(obj, INTEL_REGION_SMEM)) + return -EOPNOTSUPP; + ret = i915_gem_object_migrate(obj, NULL, INTEL_REGION_SMEM); + if (!ret) + ret = i915_gem_object_wait_migration(obj, 0); + if (!ret) + ret = i915_gem_object_pin_pages(obj); + + return ret; } static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf, diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c index 868b3469ecbd..b1e87ec08741 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c @@ -106,7 +106,9 @@ static int igt_dmabuf_import_same_driver(void *arg) int err; force_different_devices = true; - obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); + obj = i915_gem_object_create_lmem(i915, PAGE_SIZE, 0); + if (IS_ERR(obj)) + obj = i915_gem_object_create_shmem(i915, PAGE_SIZE); if (IS_ERR(obj)) goto out_ret; -- 
2.31.1
[PATCH v1 1/2] drm/i915/gem: Correct the locking and pin pattern for dma-buf
From: Thomas Hellström

If our exported dma-bufs are imported by another instance of our driver, that instance will typically have the imported dma-bufs locked during dma_buf_map_attachment(). But the exporter also locks the same reservation object in the map_dma_buf() callback, which leads to recursive locking.

So taking the lock inside _pin_pages_unlocked() is incorrect.

Additionally, the current pinning code path is contrary to the defined way that pinning should occur.

Remove the explicit pin/unpin from the map/unmap functions and move them to the attach/detach allowing correct locking to occur, and to match the static dma-buf drm_prime pattern.

Add a live selftest to exercise both dynamic and non-dynamic exports.

v2:
- Extend the selftest with a fake dynamic importer.
- Provide real pin and unpin callbacks to not abuse the interface.
v3: (ruhl)
- Remove the dynamic export support and move the pinning into the attach/detach path.
v4: (ruhl)
- Put pages does not need to assert on the dma-resv

Reported-by: Michael J. Ruhl
Signed-off-by: Thomas Hellström
Signed-off-by: Michael J.
Ruhl --- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c| 44 +-- .../drm/i915/gem/selftests/i915_gem_dmabuf.c | 116 +- 2 files changed, 146 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c index 616c3a2f1baf..ccae17d5f441 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c @@ -12,6 +12,8 @@ #include "i915_gem_object.h" #include "i915_scatterlist.h" +I915_SELFTEST_DECLARE(static bool force_different_devices;) + static struct drm_i915_gem_object *dma_buf_to_obj(struct dma_buf *buf) { return to_intel_bo(buf->priv); @@ -25,15 +27,11 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme struct scatterlist *src, *dst; int ret, i; - ret = i915_gem_object_pin_pages_unlocked(obj); - if (ret) - goto err; - /* Copy sg so that we make an independent mapping */ st = kmalloc(sizeof(struct sg_table), GFP_KERNEL); if (st == NULL) { ret = -ENOMEM; - goto err_unpin_pages; + goto err; } ret = sg_alloc_table(st, obj->mm.pages->nents, GFP_KERNEL); @@ -58,8 +56,6 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme sg_free_table(st); err_free: kfree(st); -err_unpin_pages: - i915_gem_object_unpin_pages(obj); err: return ERR_PTR(ret); } @@ -68,13 +64,9 @@ static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment, struct sg_table *sg, enum dma_data_direction dir) { - struct drm_i915_gem_object *obj = dma_buf_to_obj(attachment->dmabuf); - dma_unmap_sgtable(attachment->dev, sg, dir, DMA_ATTR_SKIP_CPU_SYNC); sg_free_table(sg); kfree(sg); - - i915_gem_object_unpin_pages(obj); } static int i915_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map *map) @@ -168,7 +160,32 @@ static int i915_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direct return err; } +/** + * i915_gem_dmabuf_attach - Do any extra attach work necessary + * @dmabuf: imported dma-buf + * @attach: new 
attach to do work on + * + */ +static int i915_gem_dmabuf_attach(struct dma_buf *dmabuf, + struct dma_buf_attachment *attach) +{ + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf); + + assert_object_held(obj); + return i915_gem_object_pin_pages(obj); +} + +static void i915_gem_dmabuf_detach(struct dma_buf *dmabuf, + struct dma_buf_attachment *attach) +{ + struct drm_i915_gem_object *obj = dma_buf_to_obj(dmabuf); + + i915_gem_object_unpin_pages(obj); +} + static const struct dma_buf_ops i915_dmabuf_ops = { + .attach = i915_gem_dmabuf_attach, + .detach = i915_gem_dmabuf_detach, .map_dma_buf = i915_gem_map_dma_buf, .unmap_dma_buf = i915_gem_unmap_dma_buf, .release = drm_gem_dmabuf_release, @@ -204,6 +221,8 @@ static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj) struct sg_table *pages; unsigned int sg_page_sizes; + assert_object_held(obj); + pages = dma_buf_map_attachment(obj->base.import_attach, DMA_BIDIRECTIONAL); if (IS_ERR(pages)) @@ -241,7 +260,8 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, if (dma_buf->ops == &i915_dmabuf_ops) { obj = dma_buf_to_obj(dma_buf); /* is it from our device? */ - if (obj->base.dev == dev) { + if (obj->base.dev == dev && + !I915_SELFTEST_ONLY(force_differ
Re: [git pull] drm for 5.14-rc1
On Wed, Jun 30, 2021 at 9:34 PM Dave Airlie wrote:
>
> Hi Linus,
>
> This is the main drm pull request for 5.14-rc1.
>
> I've done a test pull into your current tree, and hit two conflicts
> (one in vc4, one in amdgpu), both seem pretty trivial, the amdgpu one
> is recent and sfr sent out a resolution for it today.

Well, the resolutions may be trivial, but the conflict made me look at the code, and it's buggy.

Commit 04d8d73dbcbe ("drm/amdgpu: add common HMM get pages function") is broken. It made the code do

        mmap_read_lock(mm);
        vma = find_vma(mm, start);
        mmap_read_unlock(mm);

and then it *uses* that "vma" after it has dropped the lock.

That's a big no-no - once you've dropped the lock, the vma contents simply aren't reliable any more. That mapping could now be unmapped and removed at any time.

Now, the conflict actually made one of the uses go away (switching to vma_lookup() means that the subsequent code no longer needs to look at "vm_start" to verify we're actually _inside_ the vma), but it still checks for vma->vm_file afterwards.

So those locking changes in commit 04d8d73dbcbe are completely bogus.

I tried to fix up that bug while handling the conflict, but who knows what else similar is going on elsewhere.

So I would ask people to

 (a) verify that I didn't make things worse as I fixed things up (note how I had to change the last argument to amdgpu_hmm_range_get_pages() from false to true etc).

 (b) go and look at their vma lookup code: you can't just look up a vma under the lock, and then drop the lock, and then think things stay stable.

In particular for that (b) case: it is *NOT* enough to look up vma->vm_file inside the lock and cache that. No - if the test is about "no backing file before looking up pages", then you have to *keep* holding the lock until after you've actually looked up the pages!

Because otherwise any test for "vma->vm_file" is entirely pointless, for the same reason it's buggy to even look at it after dropping the lock: because once you've dropped the lock, the thing you just tested for might not be true any more.

So no, it's not valid to do

        bool has_file = vma && vma->vm_file;

and then drop the lock, because you don't use 'vma' any more as a pointer, and then use 'has_file' outside the lock. Because after you've dropped the lock, 'has_file' is now meaningless.

So it's not just about "you can't look at vma->vm_file after dropping the lock". It's more fundamental than that. Any *decision* you make based on the vma is entirely pointless and moot after the lock is dropped!

Did I fix it up correctly? Who knows. The code makes more sense to me now and seems valid. But I really *really* want to stress how locking is important.

You also can't just unlock in the middle of an operation - even if you then take the lock *again* later (as amdgpu_hmm_range_get_pages() then did), the fact that you unlocked in the middle means that all the earlier tests you did are simply no longer valid when you re-take the lock.

Linus
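Editorial sketch, not from the thread: the rule in the message above is that the lookup, the test on the result, and the work that depends on that test must all happen inside one critical section. The userspace model below only illustrates that shape. It assumes a pthread mutex as a stand-in for the mmap lock, and the demo_* types and functions are hypothetical, not kernel or amdgpu code.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a VMA and its owning address space. */
struct demo_vma {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	bool has_file;
};

struct demo_mm {
	pthread_mutex_t lock;	/* plays the role of the mmap lock */
	struct demo_vma *vmas;
	size_t count;
};

/* Caller must hold mm->lock; the result is only valid while it is held. */
static struct demo_vma *demo_vma_lookup(struct demo_mm *mm, unsigned long addr)
{
	size_t i;

	for (i = 0; i < mm->count; i++)
		if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
			return &mm->vmas[i];
	return NULL;
}

/*
 * Lookup, test and dependent work share one critical section.
 * The broken shape would be: lock, look up, cache has_file, unlock,
 * and only then act on the cached answer.
 * Returns 0 and stores the offset into the mapping, or -1 if the address
 * is unmapped or file-backed.
 */
static int demo_get_pages(struct demo_mm *mm, unsigned long addr,
			  unsigned long *offset)
{
	struct demo_vma *vma;
	int ret = -1;

	pthread_mutex_lock(&mm->lock);
	vma = demo_vma_lookup(mm, addr);
	if (vma && !vma->has_file) {
		/* Stand-in for the page lookup: done before unlocking. */
		*offset = addr - vma->start;
		ret = 0;
	}
	pthread_mutex_unlock(&mm->lock);

	return ret;
}
```

Only plain values that were computed under the lock (here the return code and the offset) leave the function; neither the vma pointer nor a cached has_file flag is used once the lock has been dropped.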
Re: [Intel-gfx] [PULL] drm-intel-next-fixes
On Thu, Jul 01, 2021 at 11:57:53AM +0300, Jani Nikula wrote: > On Wed, 30 Jun 2021, Rodrigo Vivi wrote: > > On Wed, Jun 30, 2021 at 01:05:35PM +0300, Jani Nikula wrote: > >> On Tue, 29 Jun 2021, Rodrigo Vivi wrote: > >> > Hi Dave and Daniel, > >> > > >> > Here goes drm-intel-next-fixes-2021-06-29: > >> > > >> > The biggest fix is the restoration of mmap ioctl for gen12 integrated > >> > parts > >> > which lack was breaking ADL-P with media stack. > >> > Besides that a small selftest fix and a theoretical overflow on > >> > i915->pipe_to_crtc_mapping. > >> > >> My last fixes pull for v5.13 fell between the cracks [1]. There was one > >> stable worthy fix, but since it was still in drm-intel-fixes when you > >> ran dim cherry-pick-next-fixes, it was skipped for drm-intel-next-fixes. > >> > >> I've now dropped the commit and pushed v5.13 to drm-intel-fixes, as > >> we're past that point. Subsequent dim cherry-pick-next-fixes should pick > >> it up now. > > > > it didn't, probably because the Fixes hash not being part of the drm-next > > yet?! > > Odd, should be. indeed... > > > I can cherry-pick that directly. Please let me know the commit id. > > c88e2647c5bb ("drm/i915/display: Do not zero past infoframes.vsc") pushed to drm-intel-next-queue... will wait for CI results and send another PR. I hope there's still time, otherwise it can wait for the -fixes flow > > Thanks, > Jani. > > > > > > Thanks, > > Rodrigo. > > > >> > >> Please do another next fixes pull request with that. (It's okay to pull > >> this one already though, doesn't make a difference.) > >> > >> > >> BR, > >> Jani. > >> > >> > >> [1] https://lore.kernel.org/r/87czsbu15r@intel.com > >> > >> > >> > >> > > >> > Thanks, > >> > Rodrigo. 
> >> > > >> > The following changes since commit > >> > 1bd8a7dc28c1c410f1ceefae1f2a97c06d1a67c2: > >> > > >> > Merge tag 'exynos-drm-next-for-v5.14' of > >> > git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into > >> > drm-next (2021-06-11 14:19:12 +1000) > >> > > >> > are available in the Git repository at: > >> > > >> > git://anongit.freedesktop.org/drm/drm-intel > >> > tags/drm-intel-next-fixes-2021-06-29 > >> > > >> > for you to fetch changes up to c90c4c6574f3feaf2203b5671db1907a1e15c653: > >> > > >> > drm/i915: Reinstate the mmap ioctl for some platforms (2021-06-28 > >> > 07:43:56 -0400) > >> > > >> > > >> > The biggest fix is the restoration of mmap ioctl for gen12 integrated > >> > parts > >> > which lack was breaking ADL-P with media stack. > >> > Besides that a small selftest fix and a theoretical overflow on > >> > i915->pipe_to_crtc_mapping. > >> > > >> > > >> > Chris Wilson (1): > >> > drm/i915/selftests: Reorder tasklet_disable vs local_bh_disable > >> > > >> > Jani Nikula (1): > >> > drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary > >> > crtc > >> > > >> > Thomas Hellström (1): > >> > drm/i915: Reinstate the mmap ioctl for some platforms > >> > > >> > drivers/gpu/drm/i915/display/intel_display.c | 7 ++- > >> > drivers/gpu/drm/i915/display/intel_display_types.h | 8 > >> > drivers/gpu/drm/i915/display/intel_vdsc.c | 40 +++- > >> > drivers/gpu/drm/i915/display/intel_vdsc.h | 1 + > >> > drivers/gpu/drm/i915/gem/i915_gem_mman.c | 7 +-- > >> > drivers/gpu/drm/i915/gt/selftest_execlists.c | 55 > >> > +- > >> > 6 files changed, 76 insertions(+), 42 deletions(-) > >> > >> -- > >> Jani Nikula, Intel Open Source Graphics Center > > -- > Jani Nikula, Intel Open Source Graphics Center > ___ > Intel-gfx mailing list > intel-...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On Thu, Jul 1, 2021 at 8:27 PM Martin Peres wrote: > > On 01/07/2021 11:14, Pekka Paalanen wrote: > > On Wed, 30 Jun 2021 11:58:25 -0700 > > John Harrison wrote: > > > >> On 6/30/2021 01:22, Martin Peres wrote: > >>> On 24/06/2021 10:05, Matthew Brost wrote: > From: Daniele Ceraolo Spurio > > Unblock GuC submission on Gen11+ platforms. > > Signed-off-by: Michal Wajdeczko > Signed-off-by: Daniele Ceraolo Spurio > Signed-off-by: Matthew Brost > --- > drivers/gpu/drm/i915/gt/uc/intel_guc.h| 1 + > drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 > drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h | 3 +-- > drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +- > 4 files changed, 19 insertions(+), 7 deletions(-) > > > > > ... > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c > b/drivers/gpu/drm/i915/gt/uc/intel_uc.c > index 7a69c3c027e9..61be0aa81492 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c > @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct > intel_uc *uc) > return; > } > -/* Default: enable HuC authentication only */ > -i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; > +/* Intermediate platforms are HuC authentication only */ > +if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) { > +drm_dbg(&i915->drm, "Disabling GuC only due to old > platform\n"); > >>> > >>> This comment does not seem accurate, given that DG1 is barely out, and > >>> ADL is not out yet. How about: > >>> > >>> "Disabling GuC on untested platforms"? > >>> > >> Just because something is not in the shops yet does not mean it is new. > >> Technology is always obsolete by the time it goes on sale. > > > > That is a very good reason to not use terminology like "new", "old", > > "current", "modern" etc. at all. > > > > End users like me definitely do not share your interpretation of "old". > > Yep, old and new is relative. In the end, what matters is the validation > effort, which is why I was proposing "untested platforms". 
> > Also, remember that you are not writing these messages for Intel > engineers, but instead are writing for Linux *users*. It's drm_dbg. Users don't read this stuff, at least not users with no clue what the driver does and stuff like that. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On 01/07/2021 11:14, Pekka Paalanen wrote: On Wed, 30 Jun 2021 11:58:25 -0700 John Harrison wrote: On 6/30/2021 01:22, Martin Peres wrote: On 24/06/2021 10:05, Matthew Brost wrote: From: Daniele Ceraolo Spurio Unblock GuC submission on Gen11+ platforms. Signed-off-by: Michal Wajdeczko Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 1 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h | 3 +-- drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +- 4 files changed, 19 insertions(+), 7 deletions(-) ... diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c index 7a69c3c027e9..61be0aa81492 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct intel_uc *uc) return; } - /* Default: enable HuC authentication only */ - i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; + /* Intermediate platforms are HuC authentication only */ + if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) { + drm_dbg(&i915->drm, "Disabling GuC only due to old platform\n"); This comment does not seem accurate, given that DG1 is barely out, and ADL is not out yet. How about: "Disabling GuC on untested platforms"? Just because something is not in the shops yet does not mean it is new. Technology is always obsolete by the time it goes on sale. That is a very good reason to not use terminology like "new", "old", "current", "modern" etc. at all. End users like me definitely do not share your interpretation of "old". Yep, old and new is relative. In the end, what matters is the validation effort, which is why I was proposing "untested platforms". Also, remember that you are not writing these messages for Intel engineers, but instead are writing for Linux *users*. 
Cheers, Martin Thanks, pq And the issue is not a lack of testing, it is a question of whether we are allowed to change the default on something that has already started being used by customers or not (including pre-release beta customers). I.e. it is basically a political decision not an engineering decision.
Re: [PATCH 47/47] drm/i915/guc: Unblock GuC submission on Gen11+
On 30/06/2021 21:00, Matthew Brost wrote: On Wed, Jun 30, 2021 at 11:22:38AM +0300, Martin Peres wrote: On 24/06/2021 10:05, Matthew Brost wrote: From: Daniele Ceraolo Spurio Unblock GuC submission on Gen11+ platforms. Signed-off-by: Michal Wajdeczko Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 1 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h | 3 +-- drivers/gpu/drm/i915/gt/uc/intel_uc.c | 14 +- 4 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index fae01dc8e1b9..77981788204f 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -54,6 +54,7 @@ struct intel_guc { struct ida guc_ids; struct list_head guc_id_list; + bool submission_supported; bool submission_selected; struct i915_vma *ads_vma; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index a427336ce916..405339202280 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -2042,6 +2042,13 @@ void intel_guc_submission_disable(struct intel_guc *guc) /* Note: By the time we're here, GuC may have already been reset */ } +static bool __guc_submission_supported(struct intel_guc *guc) +{ + /* GuC submission is unavailable for pre-Gen11 */ + return intel_guc_is_supported(guc) && + INTEL_GEN(guc_to_gt(guc)->i915) >= 11; +} + static bool __guc_submission_selected(struct intel_guc *guc) { struct drm_i915_private *i915 = guc_to_gt(guc)->i915; @@ -2054,6 +2061,7 @@ static bool __guc_submission_selected(struct intel_guc *guc) void intel_guc_submission_init_early(struct intel_guc *guc) { + guc->submission_supported = __guc_submission_supported(guc); guc->submission_selected = __guc_submission_selected(guc); } diff --git 
a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h index a2a3fad72be1..be767eb6ff71 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.h @@ -37,8 +37,7 @@ int intel_guc_wait_for_pending_msg(struct intel_guc *guc, static inline bool intel_guc_submission_is_supported(struct intel_guc *guc) { - /* XXX: GuC submission is unavailable for now */ - return false; + return guc->submission_supported; } static inline bool intel_guc_submission_is_wanted(struct intel_guc *guc) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_uc.c index 7a69c3c027e9..61be0aa81492 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_uc.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_uc.c @@ -34,8 +34,15 @@ static void uc_expand_default_options(struct intel_uc *uc) return; } - /* Default: enable HuC authentication only */ - i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; + /* Intermediate platforms are HuC authentication only */ + if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) { + drm_dbg(&i915->drm, "Disabling GuC only due to old platform\n"); This comment does not seem accurate, given that DG1 is barely out, and ADL is not out yet. How about: "Disabling GuC on untested platforms"? This isn't my comment but it seems right to me. AFAIK this describes the current PR but it is subject to change (i.e. we may enable GuC on DG1 by default at some point). Well, it's pretty bad PR to say that DG1 and ADL are old when they are not even out ;) But seriously, fix this sentence, it makes no sense at all unless you are really trying to confuse non-native speakers (and annoy language purists too). 
+ i915->params.enable_guc = ENABLE_GUC_LOAD_HUC; + return; + } + + /* Default: enable HuC authentication and GuC submission */ + i915->params.enable_guc = ENABLE_GUC_LOAD_HUC | ENABLE_GUC_SUBMISSION; This seems to be in contradiction with the GuC submission plan which states: "Not enabled by default on any current platforms but can be enabled via modparam enable_guc". I don't believe any current platform gets this point where GuC submission would be enabled by default. The first would be ADL-P which isn't out yet. Isn't that exactly what the line above does? When you rework the patch, could you please add a warning when the user force-enables the GuC Command Submission? Something like: "WARNING: The user force-enabled the experimental GuC command submission backend using i915.enable_guc. Please disable it if experiencing stability issues. No bug reports will be accepted on this backend". This should allow you to work on the
Re: [PATCH 0/4] mgag200: Various cleanups
Hi Thomas,

On Thu, Jul 01, 2021 at 02:43:12PM +0200, Thomas Zimmermann wrote:
> Cleanup several nits in the driver's init code. Also move constant
> data into the RO data segment. No functional changes.
>
> Tested on mgag200 HW.
>
> Thomas Zimmermann (4):
>   drm/mgag200: Don't pass flags to drm_dev_register()
>   drm/mgag200: Inline mgag200_device_init()

This patch drops a redundant error message too. It would have helped me
if the changelog had said so, but whatever.

>   drm/mgag200: Extract device type and flags in mgag200_pci_probe()
>   drm/mgag200: Constify LUT for programming bpp

The full series is:
Acked-by: Sam Ravnborg

	Sam
[PATCH v5 2/2] drm/i915: Drop all references to DRM IRQ midlayer
Remove all references to DRM's IRQ midlayer. i915 uses Linux' interrupt functions directly. v2: * also remove an outdated comment * move IRQ fix into separate patch * update Fixes tag (Daniel) Signed-off-by: Thomas Zimmermann Fixes: b318b82455bd ("drm/i915: Nuke drm_driver irq vfuncs") Cc: Ville Syrjälä Cc: Chris Wilson Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: intel-...@lists.freedesktop.org --- drivers/gpu/drm/i915/i915_drv.c | 1 - drivers/gpu/drm/i915/i915_irq.c | 5 - 2 files changed, 6 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 62327c15f457..30d8cd8c69b1 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -42,7 +42,6 @@ #include #include #include -#include #include #include diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 2203dca19895..1d4c683c9de9 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -33,7 +33,6 @@ #include #include -#include #include "display/intel_de.h" #include "display/intel_display_types.h" @@ -4564,10 +4563,6 @@ void intel_runtime_pm_enable_interrupts(struct drm_i915_private *dev_priv) bool intel_irqs_enabled(struct drm_i915_private *dev_priv) { - /* -* We only use drm_irq_uninstall() at unload and VT switch, so -* this is the only thing we need to check. -*/ return dev_priv->runtime_pm.irqs_enabled; } -- 2.32.0
[PATCH v5 0/2] drm/i915: IRQ fixes
Fix a bug in the usage of IRQs and cleanup references to the DRM IRQ midlayer. Preferably this patchset would be merged through drm-misc-next. v5: * go back to _hardirq() after CI tests reported atomic context in PCI probe; add rsp comment v4: * switch IRQ code to intel_synchronize_irq() (Daniel) v3: * also use intel_synchronize_hardirq() from other callsite v2: * split patch * also fix comment * add intel_synchronize_hardirq() (Ville) * update Fixes tag (Daniel) Thomas Zimmermann (2): drm/i915: Use the correct IRQ during resume drm/i915: Drop all references to DRM IRQ midlayer drivers/gpu/drm/i915/gt/intel_engine_cs.c | 2 +- drivers/gpu/drm/i915/gt/intel_ring_submission.c | 7 +-- drivers/gpu/drm/i915/i915_drv.c | 1 - drivers/gpu/drm/i915/i915_irq.c | 10 +- drivers/gpu/drm/i915/i915_irq.h | 1 + 5 files changed, 12 insertions(+), 9 deletions(-) base-commit: 67f5a18128770817e4218a9e496d2bf5047c51e8 prerequisite-patch-id: c2b2f08f0eccc9f5df0c0da49fa1d36267deb11d prerequisite-patch-id: c67e5d886a47b7d0266d81100837557fda34cb24 prerequisite-patch-id: 0cca17365e65370fa95d193ed2f1c88917ee1aef prerequisite-patch-id: 12b9894350a0b56579d29542943465ef5134751c prerequisite-patch-id: 3e1c37d3425f4820fe36ea3da57c65e166fe0ee5 prerequisite-patch-id: 1017c860a0bf95ce370d82b8db1745f5548fb321 prerequisite-patch-id: dcc022baab7c172978de9809702c2f4f54323047 prerequisite-patch-id: 0d05ee247042b43d5ab8f3af216e708a8e09bee8 prerequisite-patch-id: 110c411161bed6072c32185940fcd052d0bdb09a prerequisite-patch-id: d2d1aeccffdfadf2b951487b8605f59c795d84cf prerequisite-patch-id: 85fe31e27ca13adc0d1bcc7c19b1ce238a77ee6a prerequisite-patch-id: c61fdacbe035ba5c17f1ff393bc9087f16aaea7b prerequisite-patch-id: c4821af5dbba4d121769f1da85d91fbb53020ec0 prerequisite-patch-id: 0b20ef3302abfe6dc123dbc54b9dd087865f935b prerequisite-patch-id: d34eb96cbbdeb91870ace4250ea75920b1653dc2 prerequisite-patch-id: 7f64fce347d15232134d7636ca7a8d9f5bf1a3a0 prerequisite-patch-id: c83be7a285eb6682cdae0df401ab5d4c208f036b 
prerequisite-patch-id: eb1a44d2eb2685cea154dd3f17f5f463dfafd39a prerequisite-patch-id: 92a8c37dae4b8394fd6702f4af58ac7815ac3069 prerequisite-patch-id: f0237988fe4ae6eba143432d1ace8beb52d935f8 prerequisite-patch-id: bcf4d29437ed7cb78225dec4c99249eb40c18302 prerequisite-patch-id: 6407b4c7f1b80af8d329d5f796b30da11959e936 prerequisite-patch-id: 4a69e6e49d691b555f0e0874d638cd204dcb0c48 prerequisite-patch-id: be09cfa8a67dd435a25103b85bd4b1649c5190a3 prerequisite-patch-id: 813ecc9f94251c3d669155faf64c0c9e6a458393 prerequisite-patch-id: beb2b5000a1682cbd74a7e2ab1566fcae5bccbf0 prerequisite-patch-id: 754c8878611864475a0b75fd49ff38e71a21c795 prerequisite-patch-id: d7d4bac3c19f94ba9593143b3c147d83d82cb71f prerequisite-patch-id: 983d1efbe060743f5951e474961fa431d886d757 prerequisite-patch-id: 3c78b20c3b9315cd39e0ae9ea1510c6121bf9ca9 -- 2.32.0
[PATCH v5 1/2] drm/i915: Use the correct IRQ during resume
The code in xcs_resume() probably didn't work as intended. It uses
struct drm_device.irq, which is allocated to 0, but never initialized
by i915 to the device's interrupt number.

Change all calls to synchronize_hardirq() to intel_synchronize_irq(),
which uses the correct interrupt. _hardirq() functions are not needed
in this context.

v5:
	* go back to _hardirq() after PCI probe reported wrong
	  context; add rsp comment
v4:
	* switch everything to intel_synchronize_irq() (Daniel)
v3:
	* also use intel_synchronize_hardirq() at another callsite
v2:
	* wrap irq code in intel_synchronize_hardirq() (Ville)

Signed-off-by: Thomas Zimmermann
Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again")
Cc: Chris Wilson
Cc: Mika Kuoppala
Cc: Daniel Vetter
Cc: Rodrigo Vivi
Cc: Joonas Lahtinen
Cc: Maarten Lankhorst
Cc: Lucas De Marchi
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c       | 2 +-
 drivers/gpu/drm/i915/gt/intel_ring_submission.c | 7 +++++--
 drivers/gpu/drm/i915/i915_irq.c                 | 5 +++++
 drivers/gpu/drm/i915/i915_irq.h                 | 1 +
 4 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 88694822716a..5ca3d1664335 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -1229,7 +1229,7 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 		return true;
 
 	/* Waiting to drain ELSP? */
-	synchronize_hardirq(to_pci_dev(engine->i915->drm.dev)->irq);
+	intel_synchronize_hardirq(engine->i915);
 	intel_engine_flush_submission(engine);
 
 	/* ELSP is empty, but there are ready requests? E.g. after reset */
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
index 5d42a12ef3d6..5c4d204d07cc 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c
@@ -184,8 +184,11 @@ static int xcs_resume(struct intel_engine_cs *engine)
 	ENGINE_TRACE(engine, "ring:{HEAD:%04x, TAIL:%04x}\n",
 		     ring->head, ring->tail);
 
-	/* Double check the ring is empty & disabled before we resume */
-	synchronize_hardirq(engine->i915->drm.irq);
+	/*
+	 * Double check the ring is empty & disabled before we resume. Called
+	 * from atomic context during PCI probe, so _hardirq().
+	 */
+	intel_synchronize_hardirq(engine->i915);
 	if (!stop_ring(engine))
 		goto err;
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 7d0ce8b9f8ed..2203dca19895 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -4575,3 +4575,8 @@ void intel_synchronize_irq(struct drm_i915_private *i915)
 {
 	synchronize_irq(to_pci_dev(i915->drm.dev)->irq);
 }
+
+void intel_synchronize_hardirq(struct drm_i915_private *i915)
+{
+	synchronize_hardirq(to_pci_dev(i915->drm.dev)->irq);
+}
diff --git a/drivers/gpu/drm/i915/i915_irq.h b/drivers/gpu/drm/i915/i915_irq.h
index db34d5dbe402..e43b6734f21b 100644
--- a/drivers/gpu/drm/i915/i915_irq.h
+++ b/drivers/gpu/drm/i915/i915_irq.h
@@ -94,6 +94,7 @@ void intel_runtime_pm_disable_interrupts(struct drm_i915_private *dev_priv);
 void intel_runtime_pm_enable_interrupts(struct drm_i915_private *dev_priv);
 bool intel_irqs_enabled(struct drm_i915_private *dev_priv);
 void intel_synchronize_irq(struct drm_i915_private *i915);
+void intel_synchronize_hardirq(struct drm_i915_private *i915);
 
 int intel_get_crtc_scanline(struct intel_crtc *crtc);
 void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
-- 
2.32.0
[PATCH v2 1/2] drm/gud: Free buffers on device removal
Free transfer and compression buffers on device removal instead of at
DRM device removal time. This ensures that the usual 2x8MB buffers are
released when the device is unplugged and not kept around should
userspace keep the DRM device fd open.

At least Ubuntu 20.04 doesn't release the DRM device on unplug.

The damage_lock mutex is not destroyed because it is used outside the
drm_dev_enter/exit block in gud_pipe_update(). AFAICT it's possible for
an open fbdev descriptor to trigger a commit after the USB device is
gone.

v2:
Don't destroy damage_lock

Reviewed-by: Linus Walleij
Signed-off-by: Noralf Trønnes
---
 drivers/gpu/drm/gud/gud_drv.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index e8b672dc9832..45427c73587f 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -394,14 +394,15 @@ static const struct drm_driver gud_drm_driver = {
 	.minor			= 0,
 };
 
-static void gud_free_buffers_and_mutex(struct drm_device *drm, void *unused)
+static void gud_free_buffers_and_mutex(void *data)
 {
-	struct gud_device *gdrm = to_gud_device(drm);
+	struct gud_device *gdrm = data;
 
 	vfree(gdrm->compress_buf);
+	gdrm->compress_buf = NULL;
 	kfree(gdrm->bulk_buf);
+	gdrm->bulk_buf = NULL;
 	mutex_destroy(&gdrm->ctrl_lock);
-	mutex_destroy(&gdrm->damage_lock);
 }
 
 static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id)
@@ -455,7 +456,7 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	INIT_WORK(&gdrm->work, gud_flush_work);
 	gud_clear_damage(gdrm);
 
-	ret = drmm_add_action_or_reset(drm, gud_free_buffers_and_mutex, NULL);
+	ret = devm_add_action(dev, gud_free_buffers_and_mutex, gdrm);
 	if (ret)
 		return ret;
 
-- 
2.23.0
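
The idea of hanging the cleanup on the physical device and clearing the pointers afterwards can be modelled in plain userspace C. This is only an illustrative sketch: `fake_gud`, `cleanup_list`, `cleanup_add()` and `cleanup_run()` are names made up for the example and are not kernel APIs, and `devm_add_action()` is merely imitated by a small action list, not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct gud_device: only the two buffers. */
struct fake_gud {
	void *compress_buf;
	void *bulk_buf;
};

/* One registered cleanup action, loosely modelled on device-managed actions. */
struct cleanup_action {
	void (*fn)(void *data);
	void *data;
};

#define MAX_ACTIONS 8

struct cleanup_list {
	struct cleanup_action actions[MAX_ACTIONS];
	size_t count;
};

/* Register an action; returns 0 on success, -1 if the list is full. */
int cleanup_add(struct cleanup_list *list, void (*fn)(void *), void *data)
{
	assert(list && fn);
	if (list->count == MAX_ACTIONS)
		return -1;
	list->actions[list->count].fn = fn;
	list->actions[list->count].data = data;
	list->count++;
	return 0;
}

/* Run the actions in reverse registration order, as on device removal. */
void cleanup_run(struct cleanup_list *list)
{
	while (list->count) {
		struct cleanup_action *a = &list->actions[--list->count];

		a->fn(a->data);
	}
}

/* Free both buffers and clear the pointers so later users see NULL. */
void fake_gud_free_buffers(void *data)
{
	struct fake_gud *gdrm = data;

	free(gdrm->compress_buf);
	gdrm->compress_buf = NULL;
	free(gdrm->bulk_buf);
	gdrm->bulk_buf = NULL;
}
```

In this model, code that may still run after removal (for example through a file descriptor that userspace kept open) can test the pointers for NULL instead of touching freed memory.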
[PATCH v2 2/2] drm/gud: Use scatter-gather USB bulk transfer
There's a limit to how big a kmalloc buffer can be, and as memory gets
fragmented it becomes more difficult to get big buffers. The downside of
smaller buffers is that the driver has to split the transfer up which
hampers performance. Compression might also take a hit because of the
splitting.

Solve this by allocating the transfer buffer using vmalloc and create a
SG table to be passed on to the USB subsystem. vmalloc_32() is used to
avoid DMA bounce buffers on USB controllers that can only access 32-bit
addresses.

This also solves the problem that split transfers can give host side
tearing since flushing is decoupled from rendering.

usb_sg_wait() doesn't have timeout handling builtin, so it is wrapped in
a timer like 4 out of 6 users in the kernel have done.

v2:
- Use DIV_ROUND_UP (Linus)
- Add timeout note to the commit log (Linus)
- Expand note about upper buffer limit (Linus)
- Change var name s/timer/ctx/ in gud_usb_bulk_timeout()

Reviewed-by: Linus Walleij
Signed-off-by: Noralf Trønnes
---
 drivers/gpu/drm/gud/gud_drv.c      | 50 +-
 drivers/gpu/drm/gud/gud_internal.h |  2 ++
 drivers/gpu/drm/gud/gud_pipe.c     | 47
 3 files changed, 78 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index 45427c73587f..b39a54f17063 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -394,13 +394,40 @@ static const struct drm_driver gud_drm_driver = {
 	.minor			= 0,
 };
 
+static int gud_alloc_bulk_buffer(struct gud_device *gdrm)
+{
+	unsigned int i, num_pages;
+	struct page **pages;
+	void *ptr;
+	int ret;
+
+	gdrm->bulk_buf = vmalloc_32(gdrm->bulk_len);
+	if (!gdrm->bulk_buf)
+		return -ENOMEM;
+
+	num_pages = DIV_ROUND_UP(gdrm->bulk_len, PAGE_SIZE);
+	pages = kmalloc_array(num_pages, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	for (i = 0, ptr = gdrm->bulk_buf; i < num_pages; i++, ptr += PAGE_SIZE)
+		pages[i] = vmalloc_to_page(ptr);
+
+	ret = sg_alloc_table_from_pages(&gdrm->bulk_sgt, pages, num_pages,
+					0, gdrm->bulk_len, GFP_KERNEL);
+	kfree(pages);
+
+	return ret;
+}
+
 static void gud_free_buffers_and_mutex(void *data)
 {
 	struct gud_device *gdrm = data;
 
 	vfree(gdrm->compress_buf);
 	gdrm->compress_buf = NULL;
-	kfree(gdrm->bulk_buf);
+	sg_free_table(&gdrm->bulk_sgt);
+	vfree(gdrm->bulk_buf);
 	gdrm->bulk_buf = NULL;
 	mutex_destroy(&gdrm->ctrl_lock);
 }
@@ -537,24 +564,17 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id)
 	if (desc.max_buffer_size)
 		max_buffer_size = le32_to_cpu(desc.max_buffer_size);
 
-retry:
-	/*
-	 * Use plain kmalloc here since devm_kmalloc() places struct devres at the beginning
-	 * of the buffer it allocates. This wastes a lot of memory when allocating big buffers.
-	 * Asking for 2M would actually allocate 4M. This would also prevent getting the biggest
-	 * possible buffer potentially leading to split transfers.
-	 */
-	gdrm->bulk_buf = kmalloc(max_buffer_size, GFP_KERNEL | __GFP_NOWARN);
-	if (!gdrm->bulk_buf) {
-		max_buffer_size = roundup_pow_of_two(max_buffer_size) / 2;
-		if (max_buffer_size < SZ_512K)
-			return -ENOMEM;
-		goto retry;
-	}
+	/* Prevent a misbehaving device from allocating loads of RAM. 4096x4096@XRGB = 64 MB */
+	if (max_buffer_size > SZ_64M)
+		max_buffer_size = SZ_64M;
 
 	gdrm->bulk_pipe = usb_sndbulkpipe(interface_to_usbdev(intf), usb_endpoint_num(bulk_out));
 	gdrm->bulk_len = max_buffer_size;
 
+	ret = gud_alloc_bulk_buffer(gdrm);
+	if (ret)
+		return ret;
+
 	if (gdrm->compression & GUD_COMPRESSION_LZ4) {
 		gdrm->lz4_comp_mem = devm_kmalloc(dev, LZ4_MEM_COMPRESS, GFP_KERNEL);
 		if (!gdrm->lz4_comp_mem)
diff --git a/drivers/gpu/drm/gud/gud_internal.h b/drivers/gpu/drm/gud/gud_internal.h
index de2f2d2dbc60..1bb65a46c347 100644
--- a/drivers/gpu/drm/gud/gud_internal.h
+++ b/drivers/gpu/drm/gud/gud_internal.h
@@ -5,6 +5,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -26,6 +27,7 @@ struct gud_device {
 	unsigned int bulk_pipe;
 	void *bulk_buf;
 	size_t bulk_len;
+	struct sg_table bulk_sgt;
 
 	u8 compression;
 	void *lz4_comp_mem;
diff --git a/drivers/gpu/drm/gud/gud_pipe.c b/drivers/gpu/drm/gud/gud_pipe.c
index 2f83ab6b8e61..e0fb6cc969a3 100644
--- a/drivers/gpu/drm/gud/gud_pipe.c
+++ b/drivers/gpu/drm/gud/gud_pipe.c
@@ -220,13 +220,51 @@ static int gud_prep_flush(struct gud_device *gdrm, struct drm_framebuffer *fb,
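
The sizing in the probe path comes down to two small calculations: the 64 MB cap quoted in the comment and the page count passed to the SG table. The standalone sketch below reproduces both. It assumes 4 bytes per pixel for XRGB and a 4 KiB page size, neither of which is stated in the patch, and the function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Same rounding rule as the kernel's DIV_ROUND_UP(): ceil(n / d). */
size_t div_round_up(size_t n, size_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Bytes needed for a width x height framebuffer at bytes_per_pixel. */
size_t framebuffer_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
	return width * height * bytes_per_pixel;
}

/* Clamp a device-reported size to an upper limit, as the patch does with SZ_64M. */
size_t clamp_buffer_size(size_t requested, size_t limit)
{
	return requested > limit ? limit : requested;
}
```

With these assumptions a 4096x4096 XRGB frame needs exactly 64 MiB, which maps to 16384 pages; a length that is one byte over a page boundary rounds up to the next full page.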
[PATCH] drm/fourcc: Add modifier definitions for Arm Fixed Rate Compression
Arm Fixed Rate Compression (AFRC) is a proprietary fixed rate image
compression protocol and format.
It is designed to provide guaranteed bandwidth and memory footprint
reductions in graphics and media use-cases.

This patch aims to add modifier definitions for describing
AFRC.

Signed-off-by: Normunds Rieksts
---
 include/uapi/drm/drm_fourcc.h | 109 +-
 1 file changed, 106 insertions(+), 3 deletions(-)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index f7156322aba5..9f4bb4a6f358 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -900,9 +900,9 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
 
 /*
  * The top 4 bits (out of the 56 bits alloted for specifying vendor specific
- * modifiers) denote the category for modifiers. Currently we have only two
- * categories of modifiers ie AFBC and MISC. We can have a maximum of sixteen
- * different categories.
+ * modifiers) denote the category for modifiers. Currently we have three
+ * categories of modifiers ie AFBC, MISC and AFRC. We can have a maximum of
+ * sixteen different categories.
  */
 #define DRM_FORMAT_MOD_ARM_CODE(__type, __val) \
 	fourcc_mod_code(ARM, ((__u64)(__type) << 52) | ((__val) & 0x000fffffffffffffULL))
@@ -1017,6 +1017,109 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier)
  */
 #define AFBC_FORMAT_MOD_USM	(1ULL << 12)
 
+/*
+ * Arm Fixed-Rate Compression (AFRC) modifiers
+ *
+ * AFRC is a proprietary fixed rate image compression protocol and format,
+ * designed to provide guaranteed bandwidth and memory footprint
+ * reductions in graphics and media use-cases.
+ *
+ * AFRC buffers consist of one or more planes, with the same components
+ * and meaning as an uncompressed buffer using the same pixel format.
+ *
+ * Within each plane, the pixel/luma/chroma values are grouped into
+ * "coding unit" blocks which are individually compressed to a
+ * fixed size (in bytes). All coding units within a given plane of a buffer
+ * store the same number of values, and have the same compressed size.
+ *
+ * The coding unit size is configurable, allowing different rates of compression.
+ *
+ * The start of each AFRC buffer plane must be aligned to an alignment granule which
+ * depends on the coding unit size.
+ *
+ * Coding Unit Size   Plane Alignment
+ * ----------------   ---------------
+ * 16 bytes           1024 bytes
+ * 24 bytes           512 bytes
+ * 32 bytes           2048 bytes
+ *
+ * Coding units are grouped into paging tiles. AFRC buffer dimensions must be aligned
+ * to a multiple of the paging tile dimensions.
+ * The dimensions of each paging tile depend on whether the buffer is optimised for
+ * scanline (SCAN layout) or rotated (ROT layout) access.
+ *
+ * Layout   Paging Tile Width   Paging Tile Height
+ * ------   -----------------   ------------------
+ * SCAN     16 coding units     4 coding units
+ * ROT      8 coding units      8 coding units
+ *
+ * The dimensions of each coding unit depend on the number of components
+ * in the compressed plane and whether the buffer is optimised for
+ * scanline (SCAN layout) or rotated (ROT layout) access.
+ *
+ * Number of Components in Plane   Layout      Coding Unit Width   Coding Unit Height
+ * -----------------------------   ---------   -----------------   ------------------
+ * 1                               SCAN        16 samples          4 samples
+ * Example: 16x4 luma samples in a 'Y' plane
+ *          16x4 chroma 'V' values, in the 'V' plane of a fully-planar YUV buffer
+ * -----------------------------   ---------   -----------------   ------------------
+ * 1                               ROT         8 samples           8 samples
+ * Example: 8x8 luma samples in a 'Y' plane
+ *          8x8 chroma 'V' values, in the 'V' plane of a fully-planar YUV buffer
+ * -----------------------------   ---------   -----------------   ------------------
+ * 2                               DONT CARE   8 samples           4 samples
+ * Example: 8x4 chroma pairs in the 'UV' plane of a semi-planar YUV buffer
+ * -----------------------------   ---------   -----------------   ------------------
+ * 3                               DONT CARE   4 samples           4 samples
+ * Example: 4x4 pixels in an RGB buffer without alpha
+ * -----------------------------   ---------   -----------------   ------------------
+ * 4                               DONT CARE   4 samples           4 samples
+ * Example: 4x4 pixels in an RGB buffer with alpha
+ */
+
+#define DRM_FORMAT_MOD_ARM_TYPE_AFRC 0x02
+
+#define DRM_FORMAT_MOD_ARM_AFRC(__afrc_mode) \
+	DRM_FORMAT_MOD_ARM_CODE(DRM_FORMAT_MOD_ARM_TYPE_AFRC, __afrc_mode)
+
+/*
+ * AFRC coding unit size modifier.
+ *
+ * Indicates the number of bytes used to store each compressed coding unit for
+ * one or more planes in an AFRC encoded buffer. The coding unit size for chrominance
+ * is the same for both Cb and Cr
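
How the new category value ends up in a 64-bit modifier can be shown with a standalone sketch. The bit layout (8-bit vendor, 4-bit Arm category, 52-bit payload) and the Arm vendor ID 0x08 follow drm_fourcc.h; the helper names are invented for the example, and the `afrc_mode` payloads used with them are arbitrary, since the actual AFRC mode bit definitions are not visible in the portion of the patch quoted here.

```c
#include <assert.h>
#include <stdint.h>

/* Values restated from drm_fourcc.h so the example builds on its own. */
#define MOD_VENDOR_ARM		0x08ULL
#define MOD_ARM_TYPE_AFRC	0x02ULL	/* DRM_FORMAT_MOD_ARM_TYPE_AFRC in the patch */

/* Vendor in the top 8 bits, vendor-specific value in the low 56 bits. */
uint64_t mod_code(uint64_t vendor, uint64_t val)
{
	return (vendor << 56) | (val & 0x00ffffffffffffffULL);
}

/* Arm splits its 56 bits into a 4-bit category and a 52-bit payload. */
uint64_t arm_mod_code(uint64_t type, uint64_t val)
{
	assert(type < 16);
	return mod_code(MOD_VENDOR_ARM, (type << 52) | (val & 0x000fffffffffffffULL));
}

/* Counterpart of DRM_FORMAT_MOD_ARM_AFRC(); afrc_mode is an opaque payload here. */
uint64_t arm_afrc_mod(uint64_t afrc_mode)
{
	return arm_mod_code(MOD_ARM_TYPE_AFRC, afrc_mode);
}

/* Extract the 4-bit Arm category from a modifier. */
unsigned int arm_mod_type(uint64_t modifier)
{
	return (unsigned int)((modifier >> 52) & 0xf);
}
```

A consumer can therefore tell an AFRC modifier from an AFBC or MISC one by looking at bits 55:52 before interpreting the remaining 52 bits.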
[PATCH 3/7] drm/i915/guc: Increase size of CTB buffers
With the introduction of non-blocking CTBs more than one CTB can be in
flight at a time. Increasing the size of the CTBs should reduce how
often software hits the case where no space is available in the CTB
buffer.

Cc: John Harrison
Signed-off-by: Matthew Brost
Reviewed-by: Michal Wajdeczko
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 80db59b45c45..43e03aa2dde8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -58,11 +58,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct)
  * ++---+--+
  *
  * Size of each `CT Buffer`_ must be multiple of 4K.
- * As we don't expect too many messages, for now use minimum sizes.
+ * We don't expect too many messages in flight at any time, unless we are
+ * using the GuC submission. In that case each request requires a minimum
+ * 2 dwords which gives us a maximum 256 queue'd requests. Hopefully this
+ * enough space to avoid backpressure on the driver. We increase the size
+ * of the receive buffer (relative to the send) to ensure a G2H response
+ * CTB has a landing spot.
  */
 #define CTB_DESC_SIZE		ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
 #define CTB_H2G_BUFFER_SIZE	(SZ_4K)
-#define CTB_G2H_BUFFER_SIZE	(SZ_4K)
+#define CTB_G2H_BUFFER_SIZE	(4 * CTB_H2G_BUFFER_SIZE)
 
 struct ct_request {
 	struct list_head link;
@@ -643,7 +648,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg)
 	/* beware of buffer wrap case */
 	if (unlikely(available < 0))
 		available += size;
-	CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
+	CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
 	GEM_BUG_ON(available < 0);
 
 	header = cmds[head];
-- 
2.28.0
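
The sizing argument in the new comment can be checked with a little arithmetic. The sketch below is not taken from the patch and its function names are made up; it only counts 32-bit dwords. How many dwords a queued request really occupies once headers are included is an assumption here: the 256 figure in the comment matches 4 dwords per message in a 4K buffer, whereas exactly 2 dwords per message would allow up to 512.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of 32-bit dwords in a buffer of the given size in bytes. */
size_t ctb_dwords(size_t buffer_bytes)
{
	return buffer_bytes / sizeof(uint32_t);
}

/* Upper bound on messages of msg_dwords each that fit in the buffer. */
size_t ctb_max_messages(size_t buffer_bytes, size_t msg_dwords)
{
	assert(msg_dwords != 0);
	return ctb_dwords(buffer_bytes) / msg_dwords;
}
```

These are upper bounds only; a ring that treats head == tail as empty cannot be filled completely, so the usable capacity is slightly lower.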
[PATCH 4/7] drm/i915/guc: Add non blocking CTB send function
Add non blocking CTB send function, intel_guc_send_nb. GuC submission will send CTBs in the critical path and does not need to wait for these CTBs to complete before moving on, hence the need for this new function. The non-blocking CTB now must have a flow control mechanism to ensure the buffer isn't overrun. A lazy spin wait is used as we believe the flow control condition should be rare with a properly sized buffer. The function, intel_guc_send_nb, is exported in this patch but unused. Several patches later in the series make use of this function. v2: (Michal) - Use define for H2G room calculations - Move INTEL_GUC_SEND_NB define (Daniel Vetter) - Use msleep_interruptible rather than cond_resched v3: (Michal) - Move includes to following patch - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g Signed-off-by: John Harrison Signed-off-by: Matthew Brost --- .../gt/uc/abi/guc_communication_ctb_abi.h | 3 +- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 87 +-- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 4 +- 4 files changed, 91 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h index e933ca02d0eb..99e1fad5ca20 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h @@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64); * +---+---+--+ */ -#define GUC_CTB_MSG_MIN_LEN1u +#define GUC_CTB_HDR_LEN1u +#define GUC_CTB_MSG_MIN_LENGUC_CTB_HDR_LEN #define GUC_CTB_MSG_MAX_LEN256u #define GUC_CTB_MSG_0_FENCE(0x << 16) #define GUC_CTB_MSG_0_FORMAT (0xf << 12) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 4abc59f6f3cd..72e4653222e2 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -74,7 +74,14 @@ static inline struct intel_guc 
*log_to_guc(struct intel_guc_log *log) static inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len) { - return intel_guc_ct_send(&guc->ct, action, len, NULL, 0); + return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0); +} + +static +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len) +{ + return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, +INTEL_GUC_CT_SEND_NB); } static inline int @@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len, u32 *response_buf, u32 response_buf_size) { return intel_guc_ct_send(&guc->ct, action, len, -response_buf, response_buf_size); +response_buf, response_buf_size, 0); } static inline void intel_guc_to_host_event_handler(struct intel_guc *guc) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index 43e03aa2dde8..fb825cc1d090 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -3,6 +3,8 @@ * Copyright © 2016-2019 Intel Corporation */ +#include + #include "i915_drv.h" #include "intel_guc_ct.h" #include "gt/intel_gt.h" @@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct) static int ct_write(struct intel_guc_ct *ct, const u32 *action, u32 len /* in dwords */, - u32 fence) + u32 fence, u32 flags) { struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; struct guc_ct_buffer_desc *desc = ctb->desc; @@ -409,7 +411,7 @@ static int ct_write(struct intel_guc_ct *ct, used = tail - head; /* make sure there is a space including extra dw for the fence */ - if (unlikely(used + len + 1 >= size)) + if (unlikely(used + len + GUC_CTB_HDR_LEN >= size)) return -ENOSPC; /* @@ -421,9 +423,13 @@ static int ct_write(struct intel_guc_ct *ct, FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) | FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence); - hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) | - FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION | 
-GUC_HXG_REQUEST_MSG_0_DATA0, action[0]); + hxg = (flags & INTEL_GUC_CT_SEND_NB) ? + (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_EVENT) | +FIELD_PREP(GUC_HXG_EVENT_MSG_0_ACTION | + GUC_HXG_EVENT_MSG_0_DATA0, action[0])) : + (FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TY
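
The space check that ct_write() performs before queuing a message can be tried out in isolation. The following is a standalone sketch of that check, not driver code: `ring_used()` and `ring_has_room()` are hypothetical helpers, all quantities are in dwords, and `CTB_HDR_LEN` stands in for GUC_CTB_HDR_LEN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTB_HDR_LEN 1u	/* one header dword, like GUC_CTB_HDR_LEN in the patch */

/* Dwords currently occupied in a ring of `size` dwords; head == tail means empty. */
uint32_t ring_used(uint32_t head, uint32_t tail, uint32_t size)
{
	assert(head < size && tail < size);
	if (tail < head)
		return (size - head) + tail;
	return tail - head;
}

/*
 * True if a message of len dwords plus its header fits without letting
 * tail catch up with head, which would be indistinguishable from "empty".
 */
bool ring_has_room(uint32_t head, uint32_t tail, uint32_t size, uint32_t len)
{
	return ring_used(head, tail, size) + len + CTB_HDR_LEN < size;
}
```

A non-blocking sender would retry or back off when `ring_has_room()` returns false, which is the flow-control condition the commit message expects to be rare with a properly sized buffer.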
[PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
CTB writes are now in the path of command submission and should be optimized for performance. Rather than reading CTB descriptor values (e.g. head, tail) which could result in accesses across the PCIe bus, store shadow local copies and only read/write the descriptor values when absolutely necessary. Also store the current space in the each channel locally. v2: (Michel) - Add additional sanity checks for head / tail pointers - Use GUC_CTB_HDR_LEN rather than magic 1 Signed-off-by: John Harrison Signed-off-by: Matthew Brost --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 6 ++ 2 files changed, 65 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index a9cb7b608520..5b8b4ff609e2 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc) static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb) { ctb->broken = false; + ctb->tail = 0; + ctb->head = 0; + ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size); + guc_ct_buffer_desc_init(ctb->desc); } @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct, { struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; struct guc_ct_buffer_desc *desc = ctb->desc; - u32 head = desc->head; - u32 tail = desc->tail; + u32 tail = ctb->tail; u32 size = ctb->size; - u32 used; u32 header; u32 hxg; u32 *cmds = ctb->cmds; @@ -395,25 +397,22 @@ static int ct_write(struct intel_guc_ct *ct, if (unlikely(desc->status)) goto corrupted; - if (unlikely((tail | head) >= size)) { + GEM_BUG_ON(tail > size); + +#ifdef CONFIG_DRM_I915_DEBUG_GUC + if (unlikely(tail != READ_ONCE(desc->tail))) { + CT_ERROR(ct, "Tail was modified %u != %u\n", +desc->tail, ctb->tail); + desc->status |= GUC_CTB_STATUS_MISMATCH; + goto corrupted; + } + if (unlikely((desc->tail | desc->head) >= size)) { 
CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n", -head, tail, size); +desc->head, desc->tail, size); desc->status |= GUC_CTB_STATUS_OVERFLOW; goto corrupted; } - - /* -* tail == head condition indicates empty. GuC FW does not support -* using up the entire buffer to get tail == head meaning full. -*/ - if (tail < head) - used = (size - head) + tail; - else - used = tail - head; - - /* make sure there is a space including extra dw for the fence */ - if (unlikely(used + len + GUC_CTB_HDR_LEN >= size)) - return -ENOSPC; +#endif /* * dw0: CT header (including fence) @@ -454,7 +453,9 @@ static int ct_write(struct intel_guc_ct *ct, write_barrier(ct); /* now update descriptor */ + ctb->tail = tail; WRITE_ONCE(desc->tail, tail); + ctb->space -= len + GUC_CTB_HDR_LEN; return 0; @@ -470,7 +471,7 @@ static int ct_write(struct intel_guc_ct *ct, * @req: pointer to pending request * @status:placeholder for status * - * For each sent request, Guc shall send bac CT response message. + * For each sent request, GuC shall send back CT response message. * Our message handler will update status of tracked request once * response message with given fence is received. Wait here and * check for valid response status value. 
@@ -526,24 +527,35 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct) return ret; } -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw) +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw) { - struct guc_ct_buffer_desc *desc = ctb->desc; - u32 head = READ_ONCE(desc->head); + struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; + u32 head; u32 space; - space = CIRC_SPACE(desc->tail, head, ctb->size); + if (ctb->space >= len_dw) + return true; + + head = READ_ONCE(ctb->desc->head); + if (unlikely(head > ctb->size)) { + CT_ERROR(ct, "Corrupted descriptor head=%u tail=%u size=%u\n", +ctb->desc->head, ctb->desc->tail, ctb->size); + ctb->desc->status |= GUC_CTB_STATUS_OVERFLOW; + ctb->broken = true; + return false; + } + + space = CIRC_SPACE(ctb->tail, head, ctb->size); + ctb->space = space; return space >= len_dw; } static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw) { - struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; - lockdep_a
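
The caching scheme described in the commit message, where the sender keeps a local tail and a cached free-space count and only re-reads the shared head when the cached value looks too small, can be sketched in userspace C as follows. `struct shadow_ring` and its helpers are hypothetical names for this example; `circ_space()` mirrors the kernel's CIRC_SPACE() and assumes a power-of-two ring size, and the `shared_reads` counter exists only to make the slow path observable. Unlike the quoted hunk, the sketch rejects head >= size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Free slots in a power-of-two ring, keeping one slot unused so that
 * write == read always means "empty". Mirrors the kernel's CIRC_SPACE().
 */
uint32_t circ_space(uint32_t write, uint32_t read, uint32_t size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (read - (write + 1)) & (size - 1);
}

/* Hypothetical producer-side state: local shadows of the shared descriptor. */
struct shadow_ring {
	uint32_t tail;				/* local write index */
	uint32_t space;				/* cached free space, may be stale (too small) */
	uint32_t size;				/* ring size in dwords, power of two */
	const volatile uint32_t *shared_head;	/* consumer's read index, costly to read */
	unsigned int shared_reads;		/* how often the slow path ran */
};

/* Fast path uses the cached value; the shared head is read only on apparent shortage. */
bool shadow_has_room(struct shadow_ring *r, uint32_t len_dw)
{
	uint32_t head;

	if (r->space >= len_dw)
		return true;

	head = *r->shared_head;
	r->shared_reads++;
	if (head >= r->size)
		return false;	/* corrupted index, refuse to write */

	r->space = circ_space(r->tail, head, r->size);
	return r->space >= len_dw;
}

/* Account for a written message of len_dw dwords (header included by the caller). */
void shadow_advance(struct shadow_ring *r, uint32_t len_dw)
{
	assert(len_dw <= r->space);
	r->tail = (r->tail + len_dw) & (r->size - 1);
	r->space -= len_dw;
}
```

The cached value can only underestimate the real free space, because the consumer only ever frees slots; a stale cache therefore costs an extra read of the shared head but never causes an overrun.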
[PATCH 7/7] drm/i915/guc: Module load failure test for CT buffer creation
From: John Harrison

Add several module load failure injection points in the CT buffer
creation code path.

Signed-off-by: John Harrison
Signed-off-by: Matthew Brost
Reviewed-by: Michal Wajdeczko
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 5b8b4ff609e2..d2a55521ef25 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -175,6 +175,10 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 type,
 {
 	int err;
 
+	err = i915_inject_probe_error(guc_to_gt(ct_to_guc(ct))->i915, -ENXIO);
+	if (unlikely(err))
+		return err;
+
 	err = guc_action_register_ct_buffer(ct_to_guc(ct), type,
 					    desc_addr, buff_addr, size);
 	if (unlikely(err))
@@ -226,6 +230,10 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
 	u32 *cmds;
 	int err;
 
+	err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO);
+	if (err)
+		return err;
+
 	GEM_BUG_ON(ct->vma);
 
 	blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + CTB_G2H_BUFFER_SIZE;
-- 
2.28.0
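For readers unfamiliar with failure injection points, the general idea can be sketched in userspace C. This is a generic illustration only and does not reproduce how i915_inject_probe_error() is implemented: inject_fail_at, checkpoints_seen, inject_probe_error(), demo_register_buffer() and demo_init() are all hypothetical names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical test knob: fail at the Nth checkpoint reached (0 disables). */
static unsigned int inject_fail_at;
static unsigned int checkpoints_seen;

/* Returns err exactly once, when the configured checkpoint is reached. */
static int inject_probe_error(int err)
{
	if (!inject_fail_at)
		return 0;
	if (++checkpoints_seen != inject_fail_at)
		return 0;
	return err;
}

/* Stand-in for a buffer registration step with its own injection point. */
static int demo_register_buffer(void)
{
	int err = inject_probe_error(-ENXIO);

	if (err)
		return err;
	return 0; /* the real registration work would go here */
}

/* Stand-in for an init path that has one checkpoint and calls another. */
static int demo_init(void)
{
	int err = inject_probe_error(-ENXIO);

	if (err)
		return err;
	return demo_register_buffer();
}
```

Sweeping the knob from 1 upwards until the init path succeeds exercises every error-unwind path in turn, which is the point of adding more checkpoints.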
[PATCH 2/7] drm/i915/guc: Improve error message for unsolicited CT response
Improve the error message when an unsolicited CT response is received by
printing the fence that couldn't be found, the last fence, and all
requests with a response outstanding.

Signed-off-by: Matthew Brost
Reviewed-by: Michal Wajdeczko
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index b86575b99537..80db59b45c45 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -732,12 +732,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r
 		found = true;
 		break;
 	}
-	spin_unlock_irqrestore(&ct->requests.lock, flags);
-
 	if (!found) {
 		CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
-		return -ENOKEY;
+		CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
+			 ct->requests.last_fence);
+		list_for_each_entry(req, &ct->requests.pending, link)
+			CT_ERROR(ct, "request %u awaits response\n",
+				 req->fence);
+		err = -ENOKEY;
 	}
+	spin_unlock_irqrestore(&ct->requests.lock, flags);
 
 	if (unlikely(err))
 		return err;
-- 
2.28.0
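The lookup-then-report flow in this patch can be sketched in userspace C. This is only an illustration under simplifying assumptions: struct request, struct tracker and handle_response() are hypothetical names, the pending list is a plain singly linked list, and no lock is taken (the patch keeps ct->requests.lock held while it walks the list). The sketch returns -ENOENT for portability where the patch returns -ENOKEY.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct request {
	uint32_t fence;       /* identifies the request this entry tracks */
	uint32_t status;      /* filled in when the response arrives */
	struct request *next;
};

struct tracker {
	struct request *pending; /* requests still waiting for a response */
	uint32_t last_fence;     /* most recently issued fence */
};

/* Match a response to its pending request, or report everything we know. */
static int handle_response(struct tracker *t, uint32_t fence, uint32_t status)
{
	struct request *req;

	for (req = t->pending; req; req = req->next) {
		if (req->fence == fence) {
			req->status = status;
			return 0;
		}
	}

	/* Unsolicited: dump the fence, the last fence, and what is pending. */
	fprintf(stderr, "could not find fence=%u, last_fence=%u\n",
		fence, t->last_fence);
	for (req = t->pending; req; req = req->next)
		fprintf(stderr, "request %u awaits response\n", req->fence);

	return -ENOENT; /* the patch uses -ENOKEY here */
}
```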
[PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function
Implement a stall timer which fails H2G CTBs once a period of time
with no forward progress is reached to prevent deadlock.

v2:
 (Michal)
  - Improve error message in ct_deadlock()
  - Set broken when ct_deadlock() returns true
  - Return -EPIPE on ct_deadlock()
v3:
 (Michal)
  - Add ms to stall timer comment
 (Matthew)
  - Move broken check to intel_guc_ct_send()

Signed-off-by: John Harrison
Signed-off-by: Daniele Ceraolo Spurio
Signed-off-by: Matthew Brost
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 4 ++
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index fb825cc1d090..a9cb7b608520 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -4,6 +4,9 @@
  */
 
 #include
+#include
+#include
+#include
 
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
@@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
 		goto err_deregister;
 
 	ct->enabled = true;
+	ct->stall_time = KTIME_MAX;
 
 	return 0;
 
@@ -388,9 +392,6 @@ static int ct_write(struct intel_guc_ct *ct,
 	u32 *cmds = ctb->cmds;
 	unsigned int i;
 
-	if (unlikely(ctb->broken))
-		return -EPIPE;
-
 	if (unlikely(desc->status))
 		goto corrupted;
 
@@ -506,6 +507,25 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
 	return err;
 }
 
+#define GUC_CTB_TIMEOUT_MS	1500
+static inline bool ct_deadlocked(struct intel_guc_ct *ct)
+{
+	long timeout = GUC_CTB_TIMEOUT_MS;
+	bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
+
+	if (unlikely(ret)) {
+		struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
+		struct guc_ct_buffer_desc *recv = ct->ctbs.recv.desc;
+
+		CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n",
+			 ktime_ms_delta(ktime_get(), ct->stall_time),
+			 send->status, recv->status);
+		ct->ctbs.send.broken = true;
+	}
+
+	return ret;
+}
+
 static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
 {
 	struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -517,6 +537,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
 	return space >= len_dw;
 }
 
+static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
+{
+	struct intel_guc_ct_buffer *ctb = &ct->ctbs.send;
+
+	lockdep_assert_held(&ct->ctbs.send.lock);
+
+	if (unlikely(!h2g_has_room(ctb, len_dw))) {
+		if (ct->stall_time == KTIME_MAX)
+			ct->stall_time = ktime_get();
+
+		if (unlikely(ct_deadlocked(ct)))
+			return -EPIPE;
+		else
+			return -EBUSY;
+	}
+
+	ct->stall_time = KTIME_MAX;
+	return 0;
+}
+
 static int ct_send_nb(struct intel_guc_ct *ct,
 		      const u32 *action,
 		      u32 len,
@@ -529,11 +569,9 @@ static int ct_send_nb(struct intel_guc_ct *ct,
 
 	spin_lock_irqsave(&ctb->lock, spin_flags);
 
-	ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
-	if (unlikely(!ret)) {
-		ret = -EBUSY;
+	ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN);
+	if (unlikely(ret))
 		goto out;
-	}
 
 	fence = ct_get_next_fence(ct);
 	ret = ct_write(ct, action, len, fence, flags);
@@ -576,8 +614,13 @@ static int ct_send(struct intel_guc_ct *ct,
 retry:
 	spin_lock_irqsave(&ctb->lock, flags);
 	if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+		if (ct->stall_time == KTIME_MAX)
+			ct->stall_time = ktime_get();
 		spin_unlock_irqrestore(&ctb->lock, flags);
 
+		if (unlikely(ct_deadlocked(ct)))
+			return -EPIPE;
+
 		if (msleep_interruptible(sleep_period_ms))
 			return -EINTR;
 		sleep_period_ms = sleep_period_ms << 1;
@@ -585,6 +628,8 @@ static int ct_send(struct intel_guc_ct *ct,
 		goto retry;
 	}
 
+	ct->stall_time = KTIME_MAX;
+
 	fence = ct_get_next_fence(ct);
 	request.fence = fence;
 	request.status = 0;
@@ -647,6 +692,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
 		return -ENODEV;
 	}
 
+	if (unlikely(ct->ctbs.send.broken))
+		return -EPIPE;
+
 	if (flags & INTEL_GUC_CT_SEND_NB)
 		return ct_send_nb(ct, action, len, flags);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 5bb8bef024c8..bee03794c1eb 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
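The stall-timer state machine described in this patch can be sketched in plain userspace C. This is a minimal illustration, not the driver code: struct stall_tracker, NO_STALL and check_room() are hypothetical names, and the current time is passed in as a millisecond counter instead of being read with ktime_get().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_STALL         INT64_MAX /* plays the role of KTIME_MAX */
#define STALL_TIMEOUT_MS 1500      /* same value as GUC_CTB_TIMEOUT_MS */

struct stall_tracker {
	int64_t stall_start_ms; /* first time "no room" was seen, or NO_STALL */
	bool broken;            /* latched once the channel is declared dead */
};

static void stall_tracker_init(struct stall_tracker *t)
{
	t->stall_start_ms = NO_STALL;
	t->broken = false;
}

/*
 * Returns 0 when there is room, -EBUSY while stalled but still within the
 * timeout, and -EPIPE once no progress was made for longer than the timeout.
 */
static int check_room(struct stall_tracker *t, bool has_room, int64_t now_ms)
{
	if (has_room) {
		t->stall_start_ms = NO_STALL; /* forward progress: reset timer */
		return 0;
	}

	if (t->stall_start_ms == NO_STALL)
		t->stall_start_ms = now_ms;   /* stall begins now */

	if (now_ms - t->stall_start_ms > STALL_TIMEOUT_MS) {
		t->broken = true;
		return -EPIPE;
	}

	return -EBUSY;
}
```

A caller that gets -EBUSY is expected to retry later; only a stall that outlives the timeout is treated as fatal, so short bursts of back-pressure never mark the channel broken.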
[PATCH 0/7] CT changes required for GuC submission
As part of enabling GuC submission discussed in [1], [2], and [3], we
need to optimize and update the CT code, as this is now in the critical
path of submission. This series includes the patches to do that, which
are the first 7 patches from [3]. The patches should have addressed all
the feedback in [3] and should be ready to merge once CI returns and we
get a few more RBs.

v2: Fix checkpatch warning, address a couple of Michal's comments

Signed-off-by: Matthew Brost

[1] https://patchwork.freedesktop.org/series/89844/
[2] https://patchwork.freedesktop.org/series/91417/
[3] https://patchwork.freedesktop.org/series/91840/

John Harrison (1):
  drm/i915/guc: Module load failure test for CT buffer creation

Matthew Brost (6):
  drm/i915/guc: Relax CTB response timeout
  drm/i915/guc: Improve error message for unsolicited CT response
  drm/i915/guc: Increase size of CTB buffers
  drm/i915/guc: Add non blocking CTB send function
  drm/i915/guc: Add stall timer to non blocking CTB send function
  drm/i915/guc: Optimize CTB writes and reads

 .../gt/uc/abi/guc_communication_ctb_abi.h | 3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h    | 11 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 250 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 14 +-
 4 files changed, 232 insertions(+), 46 deletions(-)

-- 
2.28.0
[PATCH 1/7] drm/i915/guc: Relax CTB response timeout
In an upcoming patch we will allow more CTB requests to be sent in
parallel to the GuC for processing, so we shouldn't assume any more
that GuC will always reply within 10ms. Use a bigger hardcoded value
of 1s instead.

v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option
v3:
 (Daniel Vetter)
  - Use hardcoded value of 1s rather than config option
v4:
 (Michal)
  - Use defines for timeout values

Signed-off-by: Matthew Brost
Cc: Michal Wajdeczko
Reviewed-by: Michal Wajdeczko
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43409044528e..b86575b99537 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -474,14 +474,18 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
 	/*
 	 * Fast commands should complete in less than 10us, so sample quickly
 	 * up to that length of time, then switch to a slower sleep-wait loop.
-	 * No GuC command should ever take longer than 10ms.
+	 * No GuC command should ever take longer than 10ms but many GuC
+	 * commands can be inflight at time, so use a 1s timeout on the slower
+	 * sleep-wait loop.
 	 */
+#define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS 10
+#define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS 1000
 #define done \
 	(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
 	 GUC_HXG_ORIGIN_GUC)
-	err = wait_for_us(done, 10);
+	err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS);
 	if (err)
-		err = wait_for(done, 10);
+		err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
 #undef done
 
 	if (unlikely(err))
-- 
2.28.0
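The two-phase wait that this patch tunes (sample quickly first, then fall back to a slower sleep-wait loop with a longer budget) can be sketched in userspace C. This is an illustration only and does not reproduce the i915 wait_for_us()/wait_for() macros: wait_two_phase(), countdown_done() and fake_sleep() are hypothetical names, budgets are expressed as poll counts, and the sleep function is injected so the example stays deterministic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

typedef bool (*done_fn)(void *ctx);
typedef void (*sleep_fn)(unsigned int ms);

/*
 * Phase 1: poll the condition back to back, up to fast_polls times.
 * Phase 2: sleep 1 ms between checks, up to slow_polls more times.
 * Returns 0 once the condition holds, -ETIMEDOUT otherwise.
 */
static int wait_two_phase(done_fn done, void *ctx,
			  unsigned int fast_polls, unsigned int slow_polls,
			  sleep_fn sleep_ms)
{
	unsigned int i;

	for (i = 0; i < fast_polls; i++)
		if (done(ctx))
			return 0;

	for (i = 0; i < slow_polls; i++) {
		sleep_ms(1);
		if (done(ctx))
			return 0;
	}

	return -ETIMEDOUT;
}

/* Example condition: becomes true after *ctx failed checks. */
static bool countdown_done(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}

/* Example sleep stand-in that only records the requested time. */
static unsigned int slept_ms;
static void fake_sleep(unsigned int ms)
{
	slept_ms += ms;
}
```

Raising only the slow-phase budget, as the patch does, keeps the latency of fast commands unchanged while tolerating a deeper queue of in-flight requests.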
[PATCH v7 5/5] drm: protect drm_master pointers in drm_lease.c
drm_file->master pointers should be protected by drm_device.master_mutex or drm_file.master_lock when being dereferenced. However, in drm_lease.c, there are multiple instances where drm_file->master is accessed and dereferenced while neither lock is held. This makes drm_lease.c vulnerable to use-after-free bugs. We address this issue in 2 ways: 1. Add a new drm_file_get_master() function that calls drm_master_get on drm_file->master while holding on to drm_file.master_lock. Since drm_master_get increments the reference count of master, this prevents master from being freed until we unreference it with drm_master_put. 2. In each case where drm_file->master is directly accessed and eventually dereferenced in drm_lease.c, we wrap the access in a call to the new drm_file_get_master function, then unreference the master pointer once we are done using it. Reported-by: Daniel Vetter Signed-off-by: Desmond Cheong Zhi Xi Reviewed-by: Emil Velikov --- drivers/gpu/drm/drm_auth.c | 25 drivers/gpu/drm/drm_lease.c | 81 - include/drm/drm_auth.h | 1 + include/drm/drm_file.h | 6 +++ 4 files changed, 93 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c index fe5b6adc6133..17440ee54f30 100644 --- a/drivers/gpu/drm/drm_auth.c +++ b/drivers/gpu/drm/drm_auth.c @@ -390,6 +390,31 @@ struct drm_master *drm_master_get(struct drm_master *master) } EXPORT_SYMBOL(drm_master_get); +/** + * drm_file_get_master - reference &drm_file.master of @file_priv + * @file_priv: DRM file private + * + * Increments the reference count of @file_priv's &drm_file.master and returns + * the &drm_file.master. If @file_priv has no &drm_file.master, returns NULL. + * + * Master pointers returned from this function should be unreferenced using + * drm_master_put(). 
+ */ +struct drm_master *drm_file_get_master(struct drm_file *file_priv) +{ + struct drm_master *master = NULL; + + mutex_lock(&file_priv->master_lock); + if (!file_priv->master) + goto unlock; + master = drm_master_get(file_priv->master); + +unlock: + mutex_unlock(&file_priv->master_lock); + return master; +} +EXPORT_SYMBOL(drm_file_get_master); + static void drm_master_destroy(struct kref *kref) { struct drm_master *master = container_of(kref, struct drm_master, refcount); diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c index 00fb433bcef1..92eac73d9001 100644 --- a/drivers/gpu/drm/drm_lease.c +++ b/drivers/gpu/drm/drm_lease.c @@ -106,10 +106,19 @@ static bool _drm_has_leased(struct drm_master *master, int id) */ bool _drm_lease_held(struct drm_file *file_priv, int id) { - if (!file_priv || !file_priv->master) + bool ret; + struct drm_master *master; + + if (!file_priv) return true; - return _drm_lease_held_master(file_priv->master, id); + master = drm_file_get_master(file_priv); + if (!master) + return true; + ret = _drm_lease_held_master(master, id); + drm_master_put(&master); + + return ret; } /** @@ -128,13 +137,22 @@ bool drm_lease_held(struct drm_file *file_priv, int id) struct drm_master *master; bool ret; - if (!file_priv || !file_priv->master || !file_priv->master->lessor) + if (!file_priv) return true; - master = file_priv->master; + master = drm_file_get_master(file_priv); + if (!master) + return true; + if (!master->lessor) { + ret = true; + goto out; + } mutex_lock(&master->dev->mode_config.idr_mutex); ret = _drm_lease_held_master(master, id); mutex_unlock(&master->dev->mode_config.idr_mutex); + +out: + drm_master_put(&master); return ret; } @@ -154,10 +172,16 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in) int count_in, count_out; uint32_t crtcs_out = 0; - if (!file_priv || !file_priv->master || !file_priv->master->lessor) + if (!file_priv) return crtcs_in; - master = file_priv->master; + 
master = drm_file_get_master(file_priv); + if (!master) + return crtcs_in; + if (!master->lessor) { + crtcs_out = crtcs_in; + goto out; + } dev = master->dev; count_in = count_out = 0; @@ -176,6 +200,9 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in) count_in++; } mutex_unlock(&master->dev->mode_config.idr_mutex); + +out: + drm_master_put(&master); return crtcs_out; } @@ -489,7 +516,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev, size_t object_count; int ret = 0; struct idr leases; - struct drm_master *lessor = lessor_priv->master; +
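The "take a reference while holding the lock, use it, then put it" pattern that drm_file_get_master() introduces can be sketched in userspace C. This is a simplified model, not the DRM code: struct demo_master, struct demo_file and the demo_* helpers are hypothetical names, a pthread mutex stands in for drm_file.master_lock, and a C11 atomic counter stands in for the kref.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_master {
	atomic_int refcount;
};

struct demo_file {
	pthread_mutex_t master_lock; /* serializes access to ->master */
	struct demo_master *master;
};

static struct demo_master *demo_master_create(void)
{
	struct demo_master *m = malloc(sizeof(*m));

	if (m)
		atomic_init(&m->refcount, 1);
	return m;
}

static struct demo_master *demo_master_get(struct demo_master *m)
{
	atomic_fetch_add(&m->refcount, 1);
	return m;
}

/* Drops one reference, frees on the last one, and clears the caller's pointer. */
static void demo_master_put(struct demo_master **m)
{
	if (atomic_fetch_sub(&(*m)->refcount, 1) == 1)
		free(*m);
	*m = NULL;
}

/*
 * Returns a new reference to the file's master, or NULL if it has none.
 * The reference is taken under the lock, so the master cannot be freed
 * between reading the pointer and bumping its count.
 */
static struct demo_master *demo_file_get_master(struct demo_file *f)
{
	struct demo_master *m = NULL;

	pthread_mutex_lock(&f->master_lock);
	if (f->master)
		m = demo_master_get(f->master);
	pthread_mutex_unlock(&f->master_lock);

	return m;
}
```

Callers then work on the returned pointer without holding any lock and finish with demo_master_put(), which mirrors how the lease helpers above pair drm_file_get_master() with drm_master_put().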
[PATCH v7 4/5] drm: serialize drm_file.master with a master lock
Currently, drm_file.master pointers should be protected by
drm_device.master_mutex when being dereferenced. This is because
drm_file.master is not invariant for the lifetime of drm_file. If
drm_file is not the creator of master, then drm_file.is_master is
false, and a call to drm_setmaster_ioctl will invoke
drm_new_set_master, which then allocates a new master for drm_file and
puts the old master.

Thus, without holding drm_device.master_mutex, the old value of
drm_file.master could be freed while it is being used by another
concurrent process.

However, it is not always possible to lock drm_device.master_mutex to
dereference drm_file.master. Through the fbdev emulation code, this
might occur in a deep nest of other locks. But drm_device.master_mutex
is also the outermost lock in the nesting hierarchy, so this leads to
potential deadlocks.

To address this, we introduce a new mutex at the bottom of the lock
hierarchy that only serializes drm_file.master. With this change, the
value of drm_file.master changes only when both
drm_device.master_mutex and drm_file.master_lock are held. Hence, any
process holding either of those locks can ensure that the value of
drm_file.master will not change concurrently.

Since no lock depends on the new drm_file.master_lock, when
drm_file.master is dereferenced, but drm_device.master_mutex cannot be
held, we can safely protect the master pointer with
drm_file.master_lock.

Reported-by: Daniel Vetter
Signed-off-by: Desmond Cheong Zhi Xi
---
Since our lock inversions were a result of dev->master_mutex being
used to serialize many other things, perhaps a finer grained lock will
solve the lockdep issues.

 drivers/gpu/drm/drm_auth.c | 10 --
 drivers/gpu/drm/drm_file.c | 1 +
 include/drm/drm_file.h | 12 +---
 3 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index ab1863c5a5a0..fe5b6adc6133 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -169,11 +169,14 @@ static int drm_new_set_master(struct drm_device *dev, struct drm_file *fpriv)
 
 	WARN_ON(fpriv->is_master);
 	old_master = fpriv->master;
+	mutex_lock(&fpriv->master_lock);
 	fpriv->master = drm_master_create(dev);
 	if (!fpriv->master) {
 		fpriv->master = old_master;
+		mutex_unlock(&fpriv->master_lock);
 		return -ENOMEM;
 	}
+	mutex_unlock(&fpriv->master_lock);
 
 	fpriv->is_master = 1;
 	fpriv->authenticated = 1;
@@ -332,10 +335,13 @@ int drm_master_open(struct drm_file *file_priv)
 	 * any master object for render clients
 	 */
 	mutex_lock(&dev->master_mutex);
-	if (!dev->master)
+	if (!dev->master) {
 		ret = drm_new_set_master(dev, file_priv);
-	else
+	} else {
+		mutex_lock(&file_priv->master_lock);
 		file_priv->master = drm_master_get(dev->master);
+		mutex_unlock(&file_priv->master_lock);
+	}
 	mutex_unlock(&dev->master_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index d4f0bac6f8f8..8ccadfa1c752 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -176,6 +176,7 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
 	init_waitqueue_head(&file->event_wait);
 	file->event_space = 4096; /* set aside 4k for event buffer */
 
+	mutex_init(&file->master_lock);
 	mutex_init(&file->event_read_lock);
 
 	if (drm_core_check_feature(dev, DRIVER_GEM))
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index b81b3bfb08c8..88539f93fc8e 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -226,15 +226,21 @@ struct drm_file {
 	/**
 	 * @master:
 	 *
-	 * Master this node is currently associated with. Only relevant if
-	 * drm_is_primary_client() returns true. Note that this only
-	 * matches &drm_device.master if the master is the currently active one.
+	 * Master this node is currently associated with. Protected by struct
+	 * &drm_device.master_mutex, and serialized by @master_lock.
+	 *
+	 * Only relevant if drm_is_primary_client() returns true. Note that
+	 * this only matches &drm_device.master if the master is the currently
+	 * active one.
 	 *
 	 * See also @authentication and @is_master and the :ref:`section on
 	 * primary nodes and authentication `.
 	 */
 	struct drm_master *master;
 
+	/** @master_lock: Serializes @master. */
+	struct mutex master_lock;
+
 	/** @pid: Process that opened this file. */
 	struct pid *pid;
 
-- 
2.25.1
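The locking rule stated in this commit message (writers hold both locks, so a reader holding either one sees a stable pointer) can be sketched in userspace C. This is a toy model and not the DRM code: struct demo_dev, struct demo_file, demo_set_master() and demo_read_master() are hypothetical names, pthread mutexes stand in for the kernel mutexes, and the master is reduced to a pointer to an int.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct demo_dev {
	pthread_mutex_t master_mutex; /* outer lock, serializes many things */
};

struct demo_file {
	pthread_mutex_t master_lock;  /* innermost lock, only guards ->master */
	int *master;
	struct demo_dev *dev;
};

/* Writer: outer lock first, then inner lock, and only then swap the pointer. */
static void demo_set_master(struct demo_file *f, int *new_master)
{
	pthread_mutex_lock(&f->dev->master_mutex);
	pthread_mutex_lock(&f->master_lock);
	f->master = new_master;
	pthread_mutex_unlock(&f->master_lock);
	pthread_mutex_unlock(&f->dev->master_mutex);
}

/*
 * Reader that may already be nested inside other locks: it cannot take the
 * outer lock without risking an inversion, but the inner lock alone is
 * enough to keep ->master from changing while it is dereferenced.
 */
static int demo_read_master(struct demo_file *f, int fallback)
{
	int value = fallback;

	pthread_mutex_lock(&f->master_lock);
	if (f->master)
		value = *f->master;
	pthread_mutex_unlock(&f->master_lock);

	return value;
}
```

Because nothing is ever acquired while master_lock is held, it sits at the bottom of the hierarchy and can be taken from any context. Note that the lock only pins the pointer while it is held; keeping the object alive afterwards needs a reference as well.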
[PATCH v7 3/5] drm: add a locked version of drm_is_current_master
While checking the master status of the DRM file in
drm_is_current_master(), the device's master mutex should be
held. Without the mutex, the pointer fpriv->master may be freed
concurrently by another process calling drm_setmaster_ioctl(). This
could lead to use-after-free errors when the pointer is subsequently
dereferenced in drm_lease_owner().

The callers of drm_is_current_master() from drm_auth.c hold the
device's master mutex, but external callers do not. Hence, we implement
drm_is_current_master_locked() to be used within drm_auth.c, and
modify drm_is_current_master() to grab the device's master mutex
before checking the master status.

Reported-by: Daniel Vetter
Signed-off-by: Desmond Cheong Zhi Xi
Reviewed-by: Emil Velikov
---
 drivers/gpu/drm/drm_auth.c | 51 --
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index f00e5abdbbf4..ab1863c5a5a0 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -61,6 +61,35 @@
  * trusted clients.
  */
 
+static bool drm_is_current_master_locked(struct drm_file *fpriv)
+{
+	lockdep_assert_held_once(&fpriv->minor->dev->master_mutex);
+
+	return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
+}
+
+/**
+ * drm_is_current_master - checks whether @priv is the current master
+ * @fpriv: DRM file private
+ *
+ * Checks whether @fpriv is current master on its device. This decides whether a
+ * client is allowed to run DRM_MASTER IOCTLs.
+ *
+ * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
+ * - the current master is assumed to own the non-shareable display hardware.
+ */
+bool drm_is_current_master(struct drm_file *fpriv)
+{
+	bool ret;
+
+	mutex_lock(&fpriv->minor->dev->master_mutex);
+	ret = drm_is_current_master_locked(fpriv);
+	mutex_unlock(&fpriv->minor->dev->master_mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL(drm_is_current_master);
+
 int drm_getmagic(struct drm_device *dev, void *data, struct drm_file *file_priv)
 {
 	struct drm_auth *auth = data;
@@ -223,7 +252,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto out_unlock;
 
-	if (drm_is_current_master(file_priv))
+	if (drm_is_current_master_locked(file_priv))
 		goto out_unlock;
 
 	if (dev->master) {
@@ -272,7 +301,7 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto out_unlock;
 
-	if (!drm_is_current_master(file_priv)) {
+	if (!drm_is_current_master_locked(file_priv)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -321,7 +350,7 @@ void drm_master_release(struct drm_file *file_priv)
 	if (file_priv->magic)
 		idr_remove(&file_priv->master->magic_map, file_priv->magic);
 
-	if (!drm_is_current_master(file_priv))
+	if (!drm_is_current_master_locked(file_priv))
 		goto out;
 
 	drm_legacy_lock_master_cleanup(dev, master);
@@ -342,22 +371,6 @@ void drm_master_release(struct drm_file *file_priv)
 	mutex_unlock(&dev->master_mutex);
 }
 
-/**
- * drm_is_current_master - checks whether @priv is the current master
- * @fpriv: DRM file private
- *
- * Checks whether @fpriv is current master on its device. This decides whether a
- * client is allowed to run DRM_MASTER IOCTLs.
- *
- * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
- * - the current master is assumed to own the non-shareable display hardware.
- */
-bool drm_is_current_master(struct drm_file *fpriv)
-{
-	return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
-}
-EXPORT_SYMBOL(drm_is_current_master);
-
 /**
  * drm_master_get - reference a master pointer
  * @master: &struct drm_master
-- 
2.25.1
[PATCH v7 2/5] drm: separate locks in __drm_mode_object_find
In a future patch, _drm_lease_held will dereference drm_file->master
only after making a call to drm_file_get_master. This will increment
the reference count of drm_file->master while holding onto a new
drm_file.master_lock.

In preparation for this, the call to _drm_lease_held should be moved
out from the section locked by &dev->mode_config.idr_mutex. This
avoids creating new lock hierarchies between
&dev->mode_config.idr_mutex and &drm_file->master_lock.

Reported-by: Daniel Vetter
Signed-off-by: Desmond Cheong Zhi Xi
---
 drivers/gpu/drm/drm_mode_object.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_mode_object.c b/drivers/gpu/drm/drm_mode_object.c
index b26588b52795..83e35ff3b13a 100644
--- a/drivers/gpu/drm/drm_mode_object.c
+++ b/drivers/gpu/drm/drm_mode_object.c
@@ -146,16 +146,18 @@ struct drm_mode_object *__drm_mode_object_find(struct drm_device *dev,
 	if (obj && obj->id != id)
 		obj = NULL;
 
-	if (obj && drm_mode_object_lease_required(obj->type) &&
-	    !_drm_lease_held(file_priv, obj->id))
-		obj = NULL;
-
 	if (obj && obj->free_cb) {
 		if (!kref_get_unless_zero(&obj->refcount))
 			obj = NULL;
 	}
 	mutex_unlock(&dev->mode_config.idr_mutex);
 
+	if (obj && drm_mode_object_lease_required(obj->type) &&
+	    !_drm_lease_held(file_priv, obj->id)) {
+		drm_mode_object_put(obj);
+		obj = NULL;
+	}
+
 	return obj;
 }
 
-- 
2.25.1
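The reordering in this patch follows a common pattern: take a reference while the lookup lock is held, drop the lock, run the check that needs other locks, and release the reference again if the check fails. A minimal userspace sketch, with hypothetical names throughout (struct demo_object, struct demo_table, demo_find_object() and the lease_held callback), a pthread mutex in place of idr_mutex, and freeing on the last reference left out:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SLOTS 4

struct demo_object {
	int id;
	atomic_int refcount;
	bool lease_required;
};

struct demo_table {
	pthread_mutex_t idr_lock;            /* guards slots[] */
	struct demo_object *slots[DEMO_SLOTS];
};

static void demo_object_put(struct demo_object *obj)
{
	/* A real implementation would free the object on the last put. */
	atomic_fetch_sub(&obj->refcount, 1);
}

/*
 * Looks up an object and returns it with an extra reference, or NULL.
 * lease_held() may take its own locks, so it is only called after
 * idr_lock has been dropped; the reference keeps obj alive meanwhile.
 */
static struct demo_object *demo_find_object(struct demo_table *t, int id,
					    bool (*lease_held)(int id))
{
	struct demo_object *obj = NULL;

	pthread_mutex_lock(&t->idr_lock);
	if (id >= 0 && id < DEMO_SLOTS)
		obj = t->slots[id];
	if (obj)
		atomic_fetch_add(&obj->refcount, 1);
	pthread_mutex_unlock(&t->idr_lock);

	if (obj && obj->lease_required && !lease_held(obj->id)) {
		demo_object_put(obj); /* undo the reference taken above */
		obj = NULL;
	}

	return obj;
}

/* Example lease checks used to exercise both outcomes. */
static bool always_held(int id) { (void)id; return true; }
static bool never_held(int id) { (void)id; return false; }
```

Doing the check after the unlock means the two locks are never held at the same time, so no ordering between them has to be defined.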