Re: [linux-yocto] [v6.1/standard/preempt-rt/x86] Add drm/i915 reset -rt fix

2023-12-06 Thread Bruce Ashfield
On Wed, Dec 6, 2023 at 9:15 PM Paul Gortmaker
 wrote:
>
> Bruce,
>
> This commit has lingered on a dead end branch in linux-stable-rt since August:
>
> linux-stable-rt$git branch --contains 1a80b572f783a
>   v6.1-rt-next
> linux-stable-rt$
>
> I will contact the maintainer, but the updates are pretty slow over there.
>
> Nothing wrong with the commit - in fact it is now upstream with just a
> trivial comment change added to it.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?id=1a80b572f783a1
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e975e591a
>
> I put x86 in the Subject 'cause it will only be functional there, but it
> applies to v6.1/standard/preempt/base and so put it on one board or on
> all -rt boards ; whatver makes your maintenance life easier.
>
> It was added in v6.7 so I checked v6.5 and the -rt maintainer already
> added it there (832fa067488), so nothing to do on v6.5 (yay!)
>

I'll grab it for all of the 6.1 -rt branches, even if it really only will
be leveraged on the x86 BSPs.

Bruce


> Thanks,
> Paul
> --
>
> From 1a80b572f783a15327663bf9e7d71163976e8d6a Mon Sep 17 00:00:00 2001
> From: Tvrtko Ursulin 
> Date: Fri, 18 Aug 2023 22:45:25 -0400
> Subject: [PATCH] drm/i915: Do not disable preemption for resets
>
> [commit 40cd2835ced288789a685aa4aa7bc04b492dcd45 in linux-rt-devel]
>
> Commit ade8a0f59844 ("drm/i915: Make all GPU resets atomic") added a
> preempt disable section over the hardware reset callback to prepare the
> driver for being able to reset from atomic contexts.
>
> In retrospect I can see that the work item at a time was about removing
> the struct mutex from the reset path. Code base also briefly entertained
> the idea of doing the reset under stop_machine in order to serialize
> userspace mmap and temporary glitch in the fence registers (see
> eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex"),
> but that never materialized and was soon removed in 2caffbf11762
> ("drm/i915: Revoke mmaps and prevent access to fence registers across
> reset") and replaced with a SRCU based solution.
>
> As such, as far as I can see, today we still have a requirement that
> resets must not sleep (invoked from submission tasklets), but no need to
> support invoking them from a truly atomic context.
>
> Given that the preemption section is problematic on RT kernels, since the
> uncore lock becomes a sleeping lock and so is invalid in such section,
> lets try and remove it. Potential downside is that our short waits on GPU
> to complete the reset may get extended if CPU scheduling interferes, but
> in practice that probably isn't a deal breaker.
>
> In terms of mechanics, since the preemption disabled block is being
> removed we just need to replace a few of the wait_for_atomic macros into
> busy looping versions which will work (and not complain) when called from
> non-atomic sections.
>
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: Paul Gortmaker 
> Cc: Sebastian Andrzej Siewior 
> Acked-by: Sebastian Andrzej Siewior 
> Link: 
> https://lore.kernel.org/r/20230705093025.3689748-1-tvrtko.ursu...@linux.intel.com
> Signed-off-by: Sebastian Andrzej Siewior 
> [PG: backport from v6.4-rt ; minor context fixup caused by b7d70b8b06ed]
> Signed-off-by: Paul Gortmaker 
> Signed-off-by: Clark Williams 
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
> b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 10b930eaa8cb..6108a449cd19 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -174,13 +174,13 @@ static int i915_do_reset(struct intel_gt *gt,
> /* Assert reset for at least 20 usec, and wait for acknowledgement. */
> pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
> udelay(50);
> -   err = wait_for_atomic(i915_in_reset(pdev), 50);
> +   err = _wait_for_atomic(i915_in_reset(pdev), 50, 0);
>
> /* Clear the reset request. */
> pci_write_config_byte(pdev, I915_GDRST, 0);
> udelay(50);
> if (!err)
> -   err = wait_for_atomic(!i915_in_reset(pdev), 50);
> +   err = _wait_for_atomic(!i915_in_reset(pdev), 50, 0);
>
> return err;
>  }
> @@ -200,7 +200,7 @@ static int g33_do_reset(struct intel_gt *gt,
> struct pci_dev *pdev = to_pci_dev(gt->i915->drm.dev);
>
> pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
> -   return wait_for_atomic(g4x_reset_complete(pdev), 50);
> +   return _wait_for_atomic(g4x_reset_complete(pdev), 50, 0);
>  }
>
>  static int g4x_do_reset(struct intel_gt *gt,
> @@ -217,7 +217,7 @@ static int g4x_do_reset(struct intel_gt *gt,
>
> pci_write_config_byte(pdev, I915_GDRST,
>   GRDOM_MEDIA | GRDOM_RESET_ENABLE);
> -   ret =  wait_for_atomic(g4x_reset_complete(pdev), 50);
> +   ret =  _wait_for_atomic(g4x_reset_complete(pdev), 50, 0);
> if 

[linux-yocto] [v6.1/standard/preempt-rt/x86] Add drm/i915 reset -rt fix

2023-12-06 Thread Paul Gortmaker via lists.yoctoproject.org
Bruce,

This commit has lingered on a dead end branch in linux-stable-rt since August:

linux-stable-rt$git branch --contains 1a80b572f783a
  v6.1-rt-next
linux-stable-rt$

I will contact the maintainer, but the updates are pretty slow over there.

Nothing wrong with the commit - in fact it is now upstream with just a
trivial comment change added to it.

https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?id=1a80b572f783a1

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1e975e591a

I put x86 in the Subject 'cause it will only be functional there, but it
applies to v6.1/standard/preempt/base and so put it on one board or on
all -rt boards ; whatver makes your maintenance life easier.

It was added in v6.7 so I checked v6.5 and the -rt maintainer already
added it there (832fa067488), so nothing to do on v6.5 (yay!)

Thanks,
Paul
--

>From 1a80b572f783a15327663bf9e7d71163976e8d6a Mon Sep 17 00:00:00 2001
From: Tvrtko Ursulin 
Date: Fri, 18 Aug 2023 22:45:25 -0400
Subject: [PATCH] drm/i915: Do not disable preemption for resets

[commit 40cd2835ced288789a685aa4aa7bc04b492dcd45 in linux-rt-devel]

Commit ade8a0f59844 ("drm/i915: Make all GPU resets atomic") added a
preempt disable section over the hardware reset callback to prepare the
driver for being able to reset from atomic contexts.

In retrospect I can see that the work item at a time was about removing
the struct mutex from the reset path. Code base also briefly entertained
the idea of doing the reset under stop_machine in order to serialize
userspace mmap and temporary glitch in the fence registers (see
eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex"),
but that never materialized and was soon removed in 2caffbf11762
("drm/i915: Revoke mmaps and prevent access to fence registers across
reset") and replaced with a SRCU based solution.

As such, as far as I can see, today we still have a requirement that
resets must not sleep (invoked from submission tasklets), but no need to
support invoking them from a truly atomic context.

Given that the preemption section is problematic on RT kernels, since the
uncore lock becomes a sleeping lock and so is invalid in such section,
lets try and remove it. Potential downside is that our short waits on GPU
to complete the reset may get extended if CPU scheduling interferes, but
in practice that probably isn't a deal breaker.

In terms of mechanics, since the preemption disabled block is being
removed we just need to replace a few of the wait_for_atomic macros into
busy looping versions which will work (and not complain) when called from
non-atomic sections.

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: Paul Gortmaker 
Cc: Sebastian Andrzej Siewior 
Acked-by: Sebastian Andrzej Siewior 
Link: 
https://lore.kernel.org/r/20230705093025.3689748-1-tvrtko.ursu...@linux.intel.com
Signed-off-by: Sebastian Andrzej Siewior 
[PG: backport from v6.4-rt ; minor context fixup caused by b7d70b8b06ed]
Signed-off-by: Paul Gortmaker 
Signed-off-by: Clark Williams 

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 10b930eaa8cb..6108a449cd19 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -174,13 +174,13 @@ static int i915_do_reset(struct intel_gt *gt,
/* Assert reset for at least 20 usec, and wait for acknowledgement. */
pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
udelay(50);
-   err = wait_for_atomic(i915_in_reset(pdev), 50);
+   err = _wait_for_atomic(i915_in_reset(pdev), 50, 0);
 
/* Clear the reset request. */
pci_write_config_byte(pdev, I915_GDRST, 0);
udelay(50);
if (!err)
-   err = wait_for_atomic(!i915_in_reset(pdev), 50);
+   err = _wait_for_atomic(!i915_in_reset(pdev), 50, 0);
 
return err;
 }
@@ -200,7 +200,7 @@ static int g33_do_reset(struct intel_gt *gt,
struct pci_dev *pdev = to_pci_dev(gt->i915->drm.dev);
 
pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE);
-   return wait_for_atomic(g4x_reset_complete(pdev), 50);
+   return _wait_for_atomic(g4x_reset_complete(pdev), 50, 0);
 }
 
 static int g4x_do_reset(struct intel_gt *gt,
@@ -217,7 +217,7 @@ static int g4x_do_reset(struct intel_gt *gt,
 
pci_write_config_byte(pdev, I915_GDRST,
  GRDOM_MEDIA | GRDOM_RESET_ENABLE);
-   ret =  wait_for_atomic(g4x_reset_complete(pdev), 50);
+   ret =  _wait_for_atomic(g4x_reset_complete(pdev), 50, 0);
if (ret) {
GT_TRACE(gt, "Wait for media reset failed\n");
goto out;
@@ -225,7 +225,7 @@ static int g4x_do_reset(struct intel_gt *gt,
 
pci_write_config_byte(pdev, I915_GDRST,
  GRDOM_RENDER | GRDOM_RESET_ENABLE);
-   ret =  wait_for_atomic(g4x_reset_complete(pdev), 50);
+   ret =