Re: [Intel-gfx] [PATCH] drm/i915: Reboot CI if we get wedged during driver init
On 01.07.2020 17:17, Chris Wilson wrote:
> Quoting Michał Winiarski (2020-07-01 16:07:21)
>> From: Michał Winiarski
>>
>> Getting wedged device on driver init is pretty much unrecoverable.
>> Since we're running verious scenarios that may potentially hit this in

typo

>> CI (module reload / selftests / hotunplug), and if it happens, it means
>> that we can't trust any subsequent CI results, we should just apply the
>> taint to let the CI know that it should reboot (CI checks taint between
>> test runs).
>>
>> Signed-off-by: Michał Winiarski
>> Cc: Chris Wilson
>> Cc: Petri Latvala
>> ---
>>  drivers/gpu/drm/i915/gt/intel_reset.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c
>> b/drivers/gpu/drm/i915/gt/intel_reset.c
>> index 0156f1f5c736..d27e8bb7d550 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>> @@ -1360,6 +1360,8 @@ void intel_gt_set_wedged_on_init(struct intel_gt *gt)
>>  		I915_WEDGED_ON_INIT);
>>  	intel_gt_set_wedged(gt);
>>  	set_bit(I915_WEDGED_ON_INIT, &gt->reset.flags);
>> +
>
> Ah, we don't say around here that this WEDGED_ON_INIT is non-recoverable,
> could you please add a comment to that effect?
>

Such a comment is already in the WEDGED_ON_INIT description, but repeating
it will definitely help.

>> +	add_taint_for_CI(TAINT_WARN);

btw, today we are tainting the kernel for CI silently and from different
places, so maybe it is worth adding a debug log there with
__builtin_return_address() to better diagnose why we stopped CI?

With the typo/comment fixed,

Reviewed-by: Michal Wajdeczko

>>  }
>>
>>  void intel_gt_init_reset(struct intel_gt *gt)
>> --
>> 2.27.0
>>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
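The debug-log idea raised in this reply can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `add_taint_for_CI_logged()`, `mock_add_taint()`, `tainted_mask` and `last_taint_caller` are hypothetical names invented here, the taint mask is a plain variable rather than kernel state, and the message goes to stderr. In the kernel the recorded address would more likely be printed through a debug print using the `%pS` format specifier so that it resolves to a symbol name.

```c
#include <assert.h>
#include <stdio.h>

/* Bit index of TAINT_WARN in the kernel's taint mask. */
#define TAINT_WARN 9

static unsigned long tainted_mask;  /* models the kernel's global taint mask */
static void *last_taint_caller;     /* kept only so the sketch can be inspected */

/* Models add_taint(): set a single bit in the taint mask. */
static void mock_add_taint(unsigned int flag)
{
	tainted_mask |= 1UL << flag;
}

/*
 * Hypothetical logging variant of add_taint_for_CI().
 * noinline keeps a real call frame, so __builtin_return_address(0)
 * yields an address inside whichever function requested the taint.
 */
__attribute__((noinline))
static void add_taint_for_CI_logged(unsigned int taint)
{
	void *caller = __builtin_return_address(0);

	assert(taint < 8 * sizeof(tainted_mask));
	last_taint_caller = caller;
	fprintf(stderr, "tainting for CI: flag %u requested from %p\n",
		taint, caller);
	mock_add_taint(taint);
}
```

Calling `add_taint_for_CI_logged(TAINT_WARN)` from different places would then leave one log line per call site, which is the diagnostic the reply asks for.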
Re: [Intel-gfx] [PATCH] drm/i915: Reboot CI if we get wedged during driver init
Quoting Michał Winiarski (2020-07-01 16:07:21)
> From: Michał Winiarski
>
> Getting wedged device on driver init is pretty much unrecoverable.
> Since we're running verious scenarios that may potentially hit this in
> CI (module reload / selftests / hotunplug), and if it happens, it means
> that we can't trust any subsequent CI results, we should just apply the
> taint to let the CI know that it should reboot (CI checks taint between
> test runs).
>
> Signed-off-by: Michał Winiarski
> Cc: Chris Wilson
> Cc: Petri Latvala
> ---
>  drivers/gpu/drm/i915/gt/intel_reset.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c
> b/drivers/gpu/drm/i915/gt/intel_reset.c
> index 0156f1f5c736..d27e8bb7d550 100644
> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> @@ -1360,6 +1360,8 @@ void intel_gt_set_wedged_on_init(struct intel_gt *gt)
>  		I915_WEDGED_ON_INIT);
>  	intel_gt_set_wedged(gt);
>  	set_bit(I915_WEDGED_ON_INIT, &gt->reset.flags);
> +

Ah, we don't say around here that this WEDGED_ON_INIT is non-recoverable,
could you please add a comment to that effect?

> +	add_taint_for_CI(TAINT_WARN);
>  }
>
>  void intel_gt_init_reset(struct intel_gt *gt)
> --
> 2.27.0
>
Re: [Intel-gfx] [PATCH] drm/i915: Reboot CI if we get wedged during driver init
Quoting Michał Winiarski (2020-07-01 16:07:21)
> From: Michał Winiarski
>
> Getting wedged device on driver init is pretty much unrecoverable.
> Since we're running verious scenarios that may potentially hit this in
> CI (module reload / selftests / hotunplug), and if it happens, it means
> that we can't trust any subsequent CI results, we should just apply the
> taint to let the CI know that it should reboot (CI checks taint between
> test runs).

Ok, we treat WEDGED_ON_INIT as non-recoverable [as opposed to the less
wedged WEDGED].

> Signed-off-by: Michał Winiarski
> Cc: Chris Wilson
> Cc: Petri Latvala

Reviewed-by: Chris Wilson
-Chris
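The distinction drawn here between WEDGED and WEDGED_ON_INIT can be modelled in a few lines. The following is a userspace sketch, not the driver's code: the `model_*` names, the `struct model_gt` type and the bit positions are all hypothetical and chosen only for illustration. It shows the one property discussed in this thread, namely that a plain wedge may be cleared again while a wedge recorded during init may not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit positions only; the real flag values live in the i915 headers. */
enum {
	MODEL_WEDGED         = 1u << 0,
	MODEL_WEDGED_ON_INIT = 1u << 1,
};

/* Hypothetical stand-in for the reset state kept in struct intel_gt. */
struct model_gt {
	unsigned int reset_flags;
};

/* Ordinary wedge: the device stops accepting work but may recover later. */
static void model_set_wedged(struct model_gt *gt)
{
	assert(gt != NULL);
	gt->reset_flags |= MODEL_WEDGED;
}

/* Wedge during driver init: same as above, plus a sticky marker. */
static void model_set_wedged_on_init(struct model_gt *gt)
{
	model_set_wedged(gt);
	gt->reset_flags |= MODEL_WEDGED_ON_INIT;
}

/* Try to clear the wedge; returns false when it is treated as permanent. */
static bool model_unset_wedged(struct model_gt *gt)
{
	assert(gt != NULL);
	if (gt->reset_flags & MODEL_WEDGED_ON_INIT)
		return false;
	gt->reset_flags &= ~MODEL_WEDGED;
	return true;
}
```

Under this model nothing short of a fresh start clears the on-init marker, which is why the patch asks CI to reboot rather than carry on testing.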
[Intel-gfx] [PATCH] drm/i915: Reboot CI if we get wedged during driver init
From: Michał Winiarski

Getting wedged device on driver init is pretty much unrecoverable.
Since we're running verious scenarios that may potentially hit this in
CI (module reload / selftests / hotunplug), and if it happens, it means
that we can't trust any subsequent CI results, we should just apply the
taint to let the CI know that it should reboot (CI checks taint between
test runs).

Signed-off-by: Michał Winiarski
Cc: Chris Wilson
Cc: Petri Latvala
---
 drivers/gpu/drm/i915/gt/intel_reset.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
index 0156f1f5c736..d27e8bb7d550 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1360,6 +1360,8 @@ void intel_gt_set_wedged_on_init(struct intel_gt *gt)
 		I915_WEDGED_ON_INIT);
 	intel_gt_set_wedged(gt);
 	set_bit(I915_WEDGED_ON_INIT, &gt->reset.flags);
+
+	add_taint_for_CI(TAINT_WARN);
 }
 
 void intel_gt_init_reset(struct intel_gt *gt)
-- 
2.27.0
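For readers unfamiliar with the "CI checks taint between test runs" step, here is a rough userspace sketch of what such a check could look like. The kernel exposes its taint mask as a decimal number in /proc/sys/kernel/tainted, and TAINT_WARN corresponds to bit 9 of that mask. The function names below are hypothetical, and the assumption that the harness reacts only to TAINT_WARN is a simplification made for this example; a real CI harness may look at other bits as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Bit index of TAINT_WARN in the kernel taint mask. */
#define TAINT_WARN_BIT 9

/* Parse the decimal text found in /proc/sys/kernel/tainted. */
static unsigned long parse_taint_mask(const char *text)
{
	assert(text != NULL);
	return strtoul(text, NULL, 10);
}

/* Hypothetical policy: reboot whenever TAINT_WARN is set. */
static bool taint_requests_reboot(unsigned long mask)
{
	return (mask >> TAINT_WARN_BIT) & 1UL;
}

/* Read the live mask; returns false if the file cannot be read. */
static bool read_taint_mask(unsigned long *mask)
{
	char buf[32];
	bool ok;
	FILE *f = fopen("/proc/sys/kernel/tainted", "r");

	assert(mask != NULL);
	if (!f)
		return false;
	ok = fgets(buf, sizeof(buf), f) != NULL;
	fclose(f);
	if (ok)
		*mask = parse_taint_mask(buf);
	return ok;
}
```

A harness would call `read_taint_mask()` after each test and schedule a reboot when `taint_requests_reboot()` returns true, so a single `add_taint_for_CI(TAINT_WARN)` in the driver is enough to stop further tests from running on a machine in a suspect state.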