Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted
Quoting Mika Kuoppala (2018-02-08 11:43:50) > Chris Wilson writes: > > > After we assert the reset request (and wait for 20us), when the device > > has been fully reset it asserts the reset-status bit. Before we stop > > requesting the reset and allow the device to return to normal, we should > > wait for the reset to be completed. (Similar to how we wait for the > > device to return to normal after deasserting the reset request.) > > > > v2: Rename i915_reset_completed() probe to not cause as much confusion. > > > > Signed-off-by: Chris Wilson > > Cc: Ville Syrjälä > > --- > > drivers/gpu/drm/i915/intel_uncore.c | 15 +++ > > 1 file changed, 11 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c > > b/drivers/gpu/drm/i915/intel_uncore.c > > index 0fd59dd6bbb5..e09981a3113c 100644 > > --- a/drivers/gpu/drm/i915/intel_uncore.c > > +++ b/drivers/gpu/drm/i915/intel_uncore.c > > @@ -1550,24 +1550,31 @@ static void i915_stop_engines(struct > > drm_i915_private *dev_priv, > > gen3_stop_engine(engine); > > } > > > > -static bool i915_reset_complete(struct pci_dev *pdev) > > +static bool i915_in_reset(struct pci_dev *pdev) > > { > > u8 gdrst; > > > > pci_read_config_byte(pdev, I915_GDRST, &gdrst); > > - return (gdrst & GRDOM_RESET_STATUS) == 0; > > + return gdrst & GRDOM_RESET_STATUS; > > } > > > > static int i915_do_reset(struct drm_i915_private *dev_priv, unsigned > > engine_mask) > > { > > struct pci_dev *pdev = dev_priv->drm.pdev; > > + int err; > > > > - /* assert reset for at least 20 usec */ > > + /* Assert reset for at least 20 usec, and wait for acknowledgement. */ > > pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE); > > usleep_range(50, 200); > > + err = wait_for(i915_in_reset(pdev), 500); > > + > > + /* Clear the reset request. */ > > pci_write_config_byte(pdev, I915_GDRST, 0); > > + usleep_range(50, 200); > > + if (!err) > > + err = wait_for(!i915_in_reset(pdev), 500); > > > > Regardless of if this helps or not with pnv, it makes the > both phases clear. > > Any thoughts of starting to log the reset attempts > with timeout, even if the subsequent reset succeeds? > > Patch is, > Reviewed-by: Mika Kuoppala Although I don't expect this pair to help CI, it feels justifiable paranoia. Thanks for the review, -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted
Quoting Mika Kuoppala (2018-02-08 12:34:08) > Chris Wilson writes: > > > Quoting Mika Kuoppala (2018-02-08 11:43:50) > >> Any thoughts of starting to log the reset attempts > >> with timeout, even if the subsequent reset succeeds? > > > > If it succeeds, do we care? Capturing why it fails, sure. > > > > The question being what do we want to gain from it? Faster reset by > > removing timeout loops -- but if it does take X attempts, we can't > > really make it faster, just swap out one delay for another? > > > > It's a challenge, trying to provide the right information to solve a > > user's problem without their intervention and without any burden. > > Easier when you are chasing a problem down to know what you need. And > > likely need again in future? > > I was thinking of gauging the rest robustness. To check if > the level we are at, through CI. I would keep the timeouts and would > keep retries. > > Mainly the intent would be to find a pattern. Like if on some platform, > after test x, the first reset always timeouts would be a sign > to further robustify. But unless you automate such pattern finding, it will be lost and just annoy the next person trying to extract signal from the noise :) If you can think of it, write a test for it and let it run for a few months in CI. -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted
Chris Wilson writes: > Quoting Mika Kuoppala (2018-02-08 11:43:50) >> Any thoughts of starting to log the reset attempts >> with timeout, even if the subsequent reset succeeds? > > If it succeeds, do we care? Capturing why it fails, sure. > > The question being what do we want to gain from it? Faster reset by > removing timeout loops -- but if it does take X attempts, we can't > really make it faster, just swap out one delay for another? > > It's a challenge, trying to provide the right information to solve a > user's problem without their intervention and without any burden. > Easier when you are chasing a problem down to know what you need. And > likely need again in future? I was thinking of gauging the rest robustness. To check if the level we are at, through CI. I would keep the timeouts and would keep retries. Mainly the intent would be to find a pattern. Like if on some platform, after test x, the first reset always timeouts would be a sign to further robustify. -Mika ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted
Quoting Mika Kuoppala (2018-02-08 11:43:50) > Any thoughts of starting to log the reset attempts > with timeout, even if the subsequent reset succeeds? If it succeeds, do we care? Capturing why it fails, sure. The question being what do we want to gain from it? Faster reset by removing timeout loops -- but if it does take X attempts, we can't really make it faster, just swap out one delay for another? It's a challenge, trying to provide the right information to solve a user's problem without their intervention and without any burden. Easier when you are chasing a problem down to know what you need. And likely need again in future? -Chris ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v2] drm/i915: Wait for gen3 reset status to be asserted
Chris Wilson writes: > After we assert the reset request (and wait for 20us), when the device > has been fully reset it asserts the reset-status bit. Before we stop > requesting the reset and allow the device to return to normal, we should > wait for the reset to be completed. (Similar to how we wait for the > device to return to normal after deasserting the reset request.) > > v2: Rename i915_reset_completed() probe to not cause as much confusion. > > Signed-off-by: Chris Wilson > Cc: Ville Syrjälä > --- > drivers/gpu/drm/i915/intel_uncore.c | 15 +++ > 1 file changed, 11 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c > b/drivers/gpu/drm/i915/intel_uncore.c > index 0fd59dd6bbb5..e09981a3113c 100644 > --- a/drivers/gpu/drm/i915/intel_uncore.c > +++ b/drivers/gpu/drm/i915/intel_uncore.c > @@ -1550,24 +1550,31 @@ static void i915_stop_engines(struct drm_i915_private > *dev_priv, > gen3_stop_engine(engine); > } > > -static bool i915_reset_complete(struct pci_dev *pdev) > +static bool i915_in_reset(struct pci_dev *pdev) > { > u8 gdrst; > > pci_read_config_byte(pdev, I915_GDRST, &gdrst); > - return (gdrst & GRDOM_RESET_STATUS) == 0; > + return gdrst & GRDOM_RESET_STATUS; > } > > static int i915_do_reset(struct drm_i915_private *dev_priv, unsigned > engine_mask) > { > struct pci_dev *pdev = dev_priv->drm.pdev; > + int err; > > - /* assert reset for at least 20 usec */ > + /* Assert reset for at least 20 usec, and wait for acknowledgement. */ > pci_write_config_byte(pdev, I915_GDRST, GRDOM_RESET_ENABLE); > usleep_range(50, 200); > + err = wait_for(i915_in_reset(pdev), 500); > + > + /* Clear the reset request. */ > pci_write_config_byte(pdev, I915_GDRST, 0); > + usleep_range(50, 200); > + if (!err) > + err = wait_for(!i915_in_reset(pdev), 500); > Regardless of if this helps or not with pnv, it makes the both phases clear. Any thoughts of starting to log the reset attempts with timeout, even if the subsequent reset succeeds? Patch is, Reviewed-by: Mika Kuoppala > - return wait_for(i915_reset_complete(pdev), 500); > + return err; > } > > static bool g4x_reset_complete(struct pci_dev *pdev) > -- > 2.16.1 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx