Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
On Thu, 14 Feb 2013 18:35:32 -0200 Paulo Zanoni wrote: > Hi > > 2013/2/9 Ben Widawsky : > > On Fri, 8 Feb 2013 11:42:39 -0800 > > Jesse Barnes wrote: > > > >> On Fri, 8 Feb 2013 17:35:18 -0200 > >> Paulo Zanoni wrote: > >> > >> > From: Paulo Zanoni > >> > > >> > On ILK/SNB all we need to do is to enable the "poison" bit, but > >> > on IVB/HSW we need to enable the CPU error interrupt register, > >> > which is responsible not only for poison interrupts, but also > >> > other things. This includes the "unclaimed register" interrupt, > >> > so on the IVB irq handler we now need to: (i) check whether the > >> > interrupt was triggered by an unclaimed register and (ii) mask > >> > the error interrupt bit so we don't risk generating "unclaimed > >> > register" interrupts form inside the interrupt handler. > >> > > >> > Signed-off-by: Paulo Zanoni > >> > --- > >> > >> OTOH there's nothing the user can do about it... so we might do a > >> WARN_ONCE or something here instead. But even then, I'm not sure > >> there's much *we* can do about these, as they indicate a > >> corruption in the communication between the CPU and PCH. > >> > > > > I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE > > isn't helpful since the backtrace wouldn't really be meaningful. > > Why isn't it helpful? Right now we don't even know whether this > problem happens or not, we're completely "blind" to a possible problem > that may be affecting us in some specific cases and we don't even > know. Knowing that it happens and how often it happens is IMHO > certainly better than closing our eyes and pretending it doesn't > exist. > I suppose you're right. I'm strongly of the opinion that we won't ever see this error because the system will crap out before we'd be able to get that info - of course I cannot prove that, and I don't know enough about what exactly poison means. I just think it sucks that we have yet another gen specific thing which has TBD value. I certainly won't nak it, and of course if it proves useful, I'll be most apologetic. As for the WARN being unhelpful, it's the same problem again. You're getting the notifications via interrupt, so a backtrace is useless on IVB/HSW. Perhaps it makes sense on ILK/SNB. Reinventing a "do this once" macro isn't worthwhile either, so I guess WARN_ON with the assumption that we ignore the backtrace is fine on IVB/HSW. > > > > If OTOH, you wanted to save away this information into error state; > > I could get behind that. > > > > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
Hi 2013/2/9 Ben Widawsky : > On Fri, 8 Feb 2013 11:42:39 -0800 > Jesse Barnes wrote: > >> On Fri, 8 Feb 2013 17:35:18 -0200 >> Paulo Zanoni wrote: >> >> > From: Paulo Zanoni >> > >> > On ILK/SNB all we need to do is to enable the "poison" bit, but on >> > IVB/HSW we need to enable the CPU error interrupt register, which is >> > responsible not only for poison interrupts, but also other things. >> > This includes the "unclaimed register" interrupt, so on the IVB irq >> > handler we now need to: (i) check whether the interrupt was >> > triggered by an unclaimed register and (ii) mask the error >> > interrupt bit so we don't risk generating "unclaimed register" >> > interrupts form inside the interrupt handler. >> > >> > Signed-off-by: Paulo Zanoni >> > --- >> >> OTOH there's nothing the user can do about it... so we might do a >> WARN_ONCE or something here instead. But even then, I'm not sure >> there's much *we* can do about these, as they indicate a corruption in >> the communication between the CPU and PCH. >> > > I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE > isn't helpful since the backtrace wouldn't really be meaningful. Why isn't it helpful? Right now we don't even know whether this problem happens or not, we're completely "blind" to a possible problem that may be affecting us in some specific cases and we don't even know. Knowing that it happens and how often it happens is IMHO certainly better than closing our eyes and pretending it doesn't exist. > > If OTOH, you wanted to save away this information into error state; I > could get behind that. -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
On Fri, 8 Feb 2013 11:42:39 -0800 Jesse Barnes wrote: > On Fri, 8 Feb 2013 17:35:18 -0200 > Paulo Zanoni wrote: > > > From: Paulo Zanoni > > > > On ILK/SNB all we need to do is to enable the "poison" bit, but on > > IVB/HSW we need to enable the CPU error interrupt register, which is > > responsible not only for poison interrupts, but also other things. > > This includes the "unclaimed register" interrupt, so on the IVB irq > > handler we now need to: (i) check whether the interrupt was > > triggered by an unclaimed register and (ii) mask the error > > interrupt bit so we don't risk generating "unclaimed register" > > interrupts form inside the interrupt handler. > > > > Signed-off-by: Paulo Zanoni > > --- > > OTOH there's nothing the user can do about it... so we might do a > WARN_ONCE or something here instead. But even then, I'm not sure > there's much *we* can do about these, as they indicate a corruption in > the communication between the CPU and PCH. > I agree with Jesse. I wouldn't bother with these. Even a WARN_ONCE isn't helpful since the backtrace wouldn't really be meaningful. If OTOH, you wanted to save away this information into error state; I could get behind that. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
On Fri, 8 Feb 2013 17:54:23 -0200 Paulo Zanoni wrote: > Hi > > 2013/2/8 Jesse Barnes : > > On Fri, 8 Feb 2013 17:35:18 -0200 > > Paulo Zanoni wrote: > > > >> From: Paulo Zanoni > >> > >> On ILK/SNB all we need to do is to enable the "poison" bit, but on > >> IVB/HSW we need to enable the CPU error interrupt register, which is > >> responsible not only for poison interrupts, but also other things. > >> This includes the "unclaimed register" interrupt, so on the IVB irq > >> handler we now need to: (i) check whether the interrupt was triggered by an > >> unclaimed register and (ii) mask the error interrupt bit so we don't > >> risk generating "unclaimed register" interrupts form inside the > >> interrupt handler. > >> > >> Signed-off-by: Paulo Zanoni > >> --- > > > > OTOH there's nothing the user can do about it... so we might do a > > WARN_ONCE or something here instead. > > Well, so far I haven't seen the message. If we conclude it happens > *too much*, then we can use WARN_ONCE. > > > But even then, I'm not sure > > there's much *we* can do about these, as they indicate a corruption in > > the communication between the CPU and PCH. > > At least if we get the message we may be able to understand and/or > reproduce the problems. So far we don't even know whether the problem > is happening or not... And when there's a display bug, we don't know > if it's caused by "poison". Ok I guess the DRM_ERROR won't hurt if/until we see reports. Then we can dig in and see if keeping the message makes sense or not. -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
Hi 2013/2/8 Jesse Barnes : > On Fri, 8 Feb 2013 17:35:18 -0200 > Paulo Zanoni wrote: > >> From: Paulo Zanoni >> >> On ILK/SNB all we need to do is to enable the "poison" bit, but on >> IVB/HSW we need to enable the CPU error interrupt register, which is >> responsible not only for poison interrupts, but also other things. >> This includes the "unclaimed register" interrupt, so on the IVB irq >> handler we now need to: (i) check whether the interrupt was triggered by an >> unclaimed register and (ii) mask the error interrupt bit so we don't >> risk generating "unclaimed register" interrupts form inside the >> interrupt handler. >> >> Signed-off-by: Paulo Zanoni >> --- > > OTOH there's nothing the user can do about it... so we might do a > WARN_ONCE or something here instead. Well, so far I haven't seen the message. If we conclude it happens *too much*, then we can use WARN_ONCE. > But even then, I'm not sure > there's much *we* can do about these, as they indicate a corruption in > the communication between the CPU and PCH. At least if we get the message we may be able to understand and/or reproduce the problems. So far we don't even know whether the problem is happening or not... And when there's a display bug, we don't know if it's caused by "poison". > > -- > Jesse Barnes, Intel Open Source Technology Center -- Paulo Zanoni ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
On Fri, 8 Feb 2013 17:35:18 -0200 Paulo Zanoni wrote: > From: Paulo Zanoni > > On ILK/SNB all we need to do is to enable the "poison" bit, but on > IVB/HSW we need to enable the CPU error interrupt register, which is > responsible not only for poison interrupts, but also other things. > This includes the "unclaimed register" interrupt, so on the IVB irq > handler we now need to: (i) check whether the interrupt was triggered by an > unclaimed register and (ii) mask the error interrupt bit so we don't > risk generating "unclaimed register" interrupts form inside the > interrupt handler. > > Signed-off-by: Paulo Zanoni > --- OTOH there's nothing the user can do about it... so we might do a WARN_ONCE or something here instead. But even then, I'm not sure there's much *we* can do about these, as they indicate a corruption in the communication between the CPU and PCH. -- Jesse Barnes, Intel Open Source Technology Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 07/10] drm/i915: print Gen5+ CPU poison interrupts
From: Paulo Zanoni On ILK/SNB all we need to do is to enable the "poison" bit, but on IVB/HSW we need to enable the CPU error interrupt register, which is responsible not only for poison interrupts, but also other things. This includes the "unclaimed register" interrupt, so on the IVB irq handler we now need to: (i) check whether the interrupt was triggered by an unclaimed register and (ii) mask the error interrupt bit so we don't risk generating "unclaimed register" interrupts form inside the interrupt handler. Signed-off-by: Paulo Zanoni --- drivers/gpu/drm/i915/i915_irq.c | 49 --- drivers/gpu/drm/i915/i915_reg.h |5 ++-- 2 files changed, 49 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 10aec0e..703a08a 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -665,6 +665,17 @@ static void ibx_irq_handler(struct drm_device *dev, u32 pch_iir) DRM_DEBUG_DRIVER("PCH transcoder A underrun interrupt\n"); } +static void err_int_handler(struct drm_device *dev) +{ + drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private; + u32 err_int = I915_READ(GEN7_ERR_INT); + + if (err_int & ERR_INT_POISON) + DRM_ERROR("Poison interrupt\n"); + + I915_WRITE(GEN7_ERR_INT, err_int); +} + static void serr_int_handler(struct drm_device *dev) { drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private; @@ -715,16 +726,33 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg) { struct drm_device *dev = (struct drm_device *) arg; drm_i915_private_t *dev_priv = (drm_i915_private_t *) dev->dev_private; - u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier; + u32 de_iir, gt_iir, de_ier, pm_iir, sde_ier, de_imr; irqreturn_t ret = IRQ_NONE; int i; atomic_inc(&dev_priv->irq_received); + /* We get interrupts on unclaimed registers, so check for this before we +* do any I915_{READ,WRITE}. */ + if (IS_HASWELL(dev) && + (I915_READ_NOTRACE(FPGA_DBG) & FPGA_DBG_RM_NOCLAIM)) { + DRM_ERROR("Unclaimed register before interrupt\n"); + I915_WRITE_NOTRACE(FPGA_DBG, FPGA_DBG_RM_NOCLAIM); + } + /* disable master interrupt before clearing iir */ de_ier = I915_READ(DEIER); I915_WRITE(DEIER, de_ier & ~DE_MASTER_IRQ_CONTROL); + /* On Haswell, also mask ERR_INT because we don't want to risk +* generating "unclaimed register" interrupts from inside the interrupt +* handler. */ + de_imr = I915_READ(DEIMR); + if (IS_HASWELL(dev)) { + I915_WRITE(DEIMR, de_imr | DE_ERR_INT_IVB); + POSTING_READ(DEIMR); + } + sde_ier = I915_READ(SDEIER); I915_WRITE(SDEIER, 0); POSTING_READ(SDEIER); @@ -738,6 +766,9 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg) de_iir = I915_READ(DEIIR); if (de_iir) { + if (de_iir & DE_ERR_INT_IVB) + err_int_handler(dev); + if (de_iir & DE_AUX_CHANNEL_A_IVB) dp_aux_irq_handler(dev); @@ -775,6 +806,11 @@ static irqreturn_t ivybridge_irq_handler(int irq, void *arg) ret = IRQ_HANDLED; } + if (IS_HASWELL(dev)) { + I915_WRITE(DEIMR, de_imr); + POSTING_READ(DEIMR); + } + I915_WRITE(DEIER, de_ier); POSTING_READ(DEIER); I915_WRITE(SDEIER, sde_ier); @@ -837,6 +873,9 @@ static irqreturn_t ironlake_irq_handler(int irq, void *arg) if (de_iir & DE_PIPEB_VBLANK) drm_handle_vblank(dev, 1); + if (de_iir & DE_POISON) + DRM_ERROR("Poison interrupt\n"); + if (de_iir & DE_PLANEA_FLIP_DONE) { intel_prepare_page_flip(dev, 0); intel_finish_page_flip_plane(dev, 0); @@ -1985,7 +2024,7 @@ static int ironlake_irq_postinstall(struct drm_device *dev) /* enable kind of interrupts always enabled */ u32 display_mask = DE_MASTER_IRQ_CONTROL | DE_GSE | DE_PCH_EVENT | DE_PLANEA_FLIP_DONE | DE_PLANEB_FLIP_DONE | - DE_AUX_CHANNEL_A; + DE_AUX_CHANNEL_A | DE_POISON; u32 render_irqs; dev_priv->irq_mask = ~display_mask; @@ -2035,12 +2074,14 @@ static int ivybridge_irq_postinstall(struct drm_device *dev) DE_PLANEC_FLIP_DONE_IVB | DE_PLANEB_FLIP_DONE_IVB | DE_PLANEA_FLIP_DONE_IVB | - DE_AUX_CHANNEL_A_IVB; + DE_AUX_CHANNEL_A_IVB | + DE_ERR_INT_IVB; u32 render_irqs; dev_priv->irq_mask = ~display_mask; /* should always can generate irq */ + I915_WRITE(GEN7_ERR_INT, I915_READ(GEN7_ERR_INT)); I