Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)
Hi Jani, On Thu, Apr 11, 2013 at 12:32:29PM +0300, Jani Nikula wrote: Hi Egbert - Up front, I haven't been following this series or read any of the previous review comments. Please bear with me, and feel free to direct me to earlier comments if I'm in contradiction. Sorry for the late reply - last week I was quite busy with other stuff, this week i'm a bit preoccupied. On Tue, 09 Apr 2013, Egbert Eich e...@freedesktop.org wrote: From: Egbert Eich e...@suse.de Add a hotplug IRQ storm detection (triggered when a hotplug interrupt fires more than 5 times / sec). Okay, this is theoretical, but a display port sink could do more than that many hpd irq requests when connected. Which leads me to wonder in general if the storm detection should be different for hot plug vs. unplug and hpd irq events. Agreed. During my tests I did not see any issues with the statistics I've implemented: 5/sec was an ad-hoc choice and I wanted to start with something simple. I've tested it and it seemed to work, so I didn't bother to look into this more deeply, however if you feel 5 events / sec are too few to really do a good distinction, we could easily increase this number. There have been two situations where I have seen 'interrupt storms': 1. On G35: some boxes of the affected systems do not show this issue at all while others see a very high load but are still usable. In my recollection there were in the order of some 100 interrupts/sec on these machines. Then there were systems which would 'stall' in the worker during boot due to high load. At one point the NMI watchdog kicked in and stopped this mess. There the interrupts happened at an order of magnitude if 10k! 2. A laptop with a Sandybridge chipset where the system load went high at certain stages of charging levels - when the power supply was connected. I would assume the frequency there also was around some 100 / sec. Thus if we increase the threshold frequency to some 10 / sec we would still cover all those cases. Some other issue I've seen is 'bouncing' during manual plugging I've been contemplating how to address this. There are two things to look at: 1. multiple hotplug events due to not getting a perfect connection at first 2. EDID readout happening to early when the EDID lines are not yet fully and 'permanently' connected. It might well be, that a fix for these issues might actually also adress the issues you are pointing out. I have not seen them on Intel hardware - but this may be due to the fact that the hardware I saw it on was a separate gfx card which did not have the usual mounting bracket and thus the entire setup was a bit fragile and not really suitable for hot-plugging. However I believe that these things might happen everywere for people not quite used to plugging monitors too much. Have you observed difference between hot plug/unplug? There seems to be a difference between monitor connected/not connected: On DVI (G35) one doesn't distinguish between plug/unplug: when the hotplug line on the connector changes state an interrupt is sent. On this system storms only happened when a monitor was conneted - since the state of the HPD pin is signalled thru different frequencies on a line across SDVO (in my recollection it was 10 vs. 20 MHz) I believe that due to cross talk the higher(?) frequency could not always reliably be measured. I did not have access to the laptop system and the customer was not patient enough to help me to debug this further with me. Generally I think we would still adress the 'strom condition' if we raised the threshold to 20 or 30 /sec. What do you think? Has this been a problem on PCH split platforms, i.e. since ilk/gen5? I've also observed this on Sandybridge - which would be past GEN5, wouldn't it? I will address some of the other issues mentioned in a new patch. Thanks a lot for looking at it! Cheers, Egbert. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)
Hi Jani, I've rebased and regenerated all the patches now, as there were some other bikesheds i had not adressed. I've also included all Reviewed-By: This should make it easier to integrate the patches. Some comments below: On Thu, Apr 11, 2013 at 12:32:29PM +0300, Jani Nikula wrote: + struct { + unsigned long hpd_last_jiffies; + int hpd_cnt; + enum { + HPD_ENABLED = 0, + HPD_DISABLED = 1, + HPD_MARK_DISABLED = 2 + } hpd_mark; + } hpd_stats[HPD_NUM_PINS]; With all the hotplug stuff being added, I think it's getting time to group all the hotplug stuff under a struct. I will postpone this until I address the issues that I have on my TODO. int num_pch_pll; int num_plane; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4c5bdd0..32b5527 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -582,6 +582,37 @@ static void gen6_queue_rps_work(struct drm_i915_private *dev_priv, queue_work(dev_priv-wq, dev_priv-rps.work); } +#define HPD_STORM_DETECT_PERIOD 1000 + +static inline void hotplug_irq_storm_detect(struct drm_device *dev, + u32 hotplug_trigger, + const u32 *hpd) I think you should just add the bool return status right here instead of postponing until the later patch that needs it. I thought about this and left it as it is: Returning a bool status now will raise other questions as the logic in this patch doesn't require it. I'd rather have a bigger patch later which will however clearly explains the reason for the change to the function. +{ + drm_i915_private_t *dev_priv = dev-dev_private; + unsigned long irqflags; + int i; + + spin_lock_irqsave(dev_priv-irq_lock, irqflags); + + for (i = 1; i HPD_NUM_PINS; i++) { + if ((hpd[i] hotplug_trigger) + dev_priv-hpd_stats[i].hpd_mark == HPD_ENABLED) { Bikeshed: You might get a nicer layout below by negating that if and adding continue. Fixed. + if (!time_in_range(jiffies, dev_priv-hpd_stats[i].hpd_last_jiffies, + dev_priv-hpd_stats[i].hpd_last_jiffies + + msecs_to_jiffies(HPD_STORM_DETECT_PERIOD))) { + dev_priv-hpd_stats[i].hpd_last_jiffies = jiffies; + dev_priv-hpd_stats[i].hpd_cnt = 0; + } else if (dev_priv-hpd_stats[i].hpd_cnt 5) { + dev_priv-hpd_stats[i].hpd_mark = HPD_MARK_DISABLED; + DRM_DEBUG_KMS(HPD interrupt storm detected on PIN %d\n, i); Extra space before PIN. Fixed. + } else + dev_priv-hpd_stats[i].hpd_cnt++; If one branch requires braces, then all do. Fixed. Cheers, Egbert. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)
Hi Egbert - Up front, I haven't been following this series or read any of the previous review comments. Please bear with me, and feel free to direct me to earlier comments if I'm in contradiction. On Tue, 09 Apr 2013, Egbert Eich e...@freedesktop.org wrote: From: Egbert Eich e...@suse.de Add a hotplug IRQ storm detection (triggered when a hotplug interrupt fires more than 5 times / sec). Okay, this is theoretical, but a display port sink could do more than that many hpd irq requests when connected. Which leads me to wonder in general if the storm detection should be different for hot plug vs. unplug and hpd irq events. Rationale: Despite of the many attempts to fix the problem with noisy hotplug interrupt lines we are still seeing systems which have issues: Once cause of noise seems to be bad routing of the hotplug line on the board: cross talk from other signals seems to cause erronous hotplug interrupts. This has been documented as an erratum for the the i945GM chipset and thus hotplug support was disabled for this chipset model but others seem to have this problem, too. We have seen this issue on a G35 motherboard for example: Even different motherboards of the same model seem to behave differently: while some only see only around 10-100 interrupts/s others seem to see 5k or more. We've also observed a dependency on the selected video mode. Also on certain laptops interrupt noise seems to occur duing battery charging when the battery is at a certain charge levels. Have you observed difference between hot plug/unplug? Has this been a problem on PCH split platforms, i.e. since ilk/gen5? Thus we add a simple algorithm here that detects an 'interrupt storm' condition. v2: Fixed comment. v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work stuff. v4: Followed by Jesse Barnes to use a time_..() macro. Signed-off-by: Egbert Eich e...@suse.de Acked-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.h | 9 ++ drivers/gpu/drm/i915/i915_irq.c | 66 + 2 files changed, 63 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 44fca0b..83fc1a6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -929,6 +929,15 @@ typedef struct drm_i915_private { struct work_struct hotplug_work; bool enable_hotplug_processing; + struct { + unsigned long hpd_last_jiffies; + int hpd_cnt; + enum { + HPD_ENABLED = 0, + HPD_DISABLED = 1, + HPD_MARK_DISABLED = 2 + } hpd_mark; + } hpd_stats[HPD_NUM_PINS]; With all the hotplug stuff being added, I think it's getting time to group all the hotplug stuff under a struct. int num_pch_pll; int num_plane; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4c5bdd0..32b5527 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -582,6 +582,37 @@ static void gen6_queue_rps_work(struct drm_i915_private *dev_priv, queue_work(dev_priv-wq, dev_priv-rps.work); } +#define HPD_STORM_DETECT_PERIOD 1000 + +static inline void hotplug_irq_storm_detect(struct drm_device *dev, + u32 hotplug_trigger, + const u32 *hpd) I think you should just add the bool return status right here instead of postponing until the later patch that needs it. +{ + drm_i915_private_t *dev_priv = dev-dev_private; + unsigned long irqflags; + int i; + + spin_lock_irqsave(dev_priv-irq_lock, irqflags); + + for (i = 1; i HPD_NUM_PINS; i++) { + if ((hpd[i] hotplug_trigger) + dev_priv-hpd_stats[i].hpd_mark == HPD_ENABLED) { Bikeshed: You might get a nicer layout below by negating that if and adding continue. + if (!time_in_range(jiffies, dev_priv-hpd_stats[i].hpd_last_jiffies, + dev_priv-hpd_stats[i].hpd_last_jiffies + + msecs_to_jiffies(HPD_STORM_DETECT_PERIOD))) { + dev_priv-hpd_stats[i].hpd_last_jiffies = jiffies; + dev_priv-hpd_stats[i].hpd_cnt = 0; + } else if (dev_priv-hpd_stats[i].hpd_cnt 5) { + dev_priv-hpd_stats[i].hpd_mark = HPD_MARK_DISABLED; + DRM_DEBUG_KMS(HPD interrupt storm detected on PIN %d\n, i); Extra space before PIN. + } else + dev_priv-hpd_stats[i].hpd_cnt++; If one branch requires braces, then all do. The above ends up adding HPD_MARK_DISABLED only after
Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)
On Thu, Apr 11, 2013 at 11:32 AM, Jani Nikula jani.nik...@linux.intel.com wrote: On Tue, 09 Apr 2013, Egbert Eich e...@freedesktop.org wrote: From: Egbert Eich e...@suse.de Add a hotplug IRQ storm detection (triggered when a hotplug interrupt fires more than 5 times / sec). Okay, this is theoretical, but a display port sink could do more than that many hpd irq requests when connected. Which leads me to wonder in general if the storm detection should be different for hot plug vs. unplug and hpd irq events. Yeah, I've considered what to do with short pulse hotplug events. But otoh we don't differentiate that yet in our code and worst case we need to increase the limits a bit maybe. I guess we just need to see what happens in reality and then act accordingly. Rationale: Despite of the many attempts to fix the problem with noisy hotplug interrupt lines we are still seeing systems which have issues: Once cause of noise seems to be bad routing of the hotplug line on the board: cross talk from other signals seems to cause erronous hotplug interrupts. This has been documented as an erratum for the the i945GM chipset and thus hotplug support was disabled for this chipset model but others seem to have this problem, too. We have seen this issue on a G35 motherboard for example: Even different motherboards of the same model seem to behave differently: while some only see only around 10-100 interrupts/s others seem to see 5k or more. We've also observed a dependency on the selected video mode. Also on certain laptops interrupt noise seems to occur duing battery charging when the battery is at a certain charge levels. Have you observed difference between hot plug/unplug? Has this been a problem on PCH split platforms, i.e. since ilk/gen5? I've discussed this with Art (iirc) and apparently the BIOS/Windows guys have some elaborate schemes for this on all platforms. So I guess this could happen everywhere. Note that hpd storms could also be caused in the sink, so imo we want this everywhere. Thus we add a simple algorithm here that detects an 'interrupt storm' condition. v2: Fixed comment. v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work stuff. v4: Followed by Jesse Barnes to use a time_..() macro. Signed-off-by: Egbert Eich e...@suse.de Acked-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.h | 9 ++ drivers/gpu/drm/i915/i915_irq.c | 66 + 2 files changed, 63 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 44fca0b..83fc1a6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -929,6 +929,15 @@ typedef struct drm_i915_private { struct work_struct hotplug_work; bool enable_hotplug_processing; + struct { + unsigned long hpd_last_jiffies; + int hpd_cnt; + enum { + HPD_ENABLED = 0, + HPD_DISABLED = 1, + HPD_MARK_DISABLED = 2 + } hpd_mark; + } hpd_stats[HPD_NUM_PINS]; With all the hotplug stuff being added, I think it's getting time to group all the hotplug stuff under a struct. Yeah, agreed. Though to avoid too much rebase hell I think it's better to do that as a follow-up. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)
From: Egbert Eich e...@suse.de Add a hotplug IRQ storm detection (triggered when a hotplug interrupt fires more than 5 times / sec). Rationale: Despite of the many attempts to fix the problem with noisy hotplug interrupt lines we are still seeing systems which have issues: Once cause of noise seems to be bad routing of the hotplug line on the board: cross talk from other signals seems to cause erronous hotplug interrupts. This has been documented as an erratum for the the i945GM chipset and thus hotplug support was disabled for this chipset model but others seem to have this problem, too. We have seen this issue on a G35 motherboard for example: Even different motherboards of the same model seem to behave differently: while some only see only around 10-100 interrupts/s others seem to see 5k or more. We've also observed a dependency on the selected video mode. Also on certain laptops interrupt noise seems to occur duing battery charging when the battery is at a certain charge levels. Thus we add a simple algorithm here that detects an 'interrupt storm' condition. v2: Fixed comment. v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work stuff. v4: Followed by Jesse Barnes to use a time_..() macro. Signed-off-by: Egbert Eich e...@suse.de Acked-by: Chris Wilson ch...@chris-wilson.co.uk Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org --- drivers/gpu/drm/i915/i915_drv.h | 9 ++ drivers/gpu/drm/i915/i915_irq.c | 66 + 2 files changed, 63 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 44fca0b..83fc1a6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -929,6 +929,15 @@ typedef struct drm_i915_private { struct work_struct hotplug_work; bool enable_hotplug_processing; + struct { + unsigned long hpd_last_jiffies; + int hpd_cnt; + enum { + HPD_ENABLED = 0, + HPD_DISABLED = 1, + HPD_MARK_DISABLED = 2 + } hpd_mark; + } hpd_stats[HPD_NUM_PINS]; int num_pch_pll; int num_plane; diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 4c5bdd0..32b5527 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -582,6 +582,37 @@ static void gen6_queue_rps_work(struct drm_i915_private *dev_priv, queue_work(dev_priv-wq, dev_priv-rps.work); } +#define HPD_STORM_DETECT_PERIOD 1000 + +static inline void hotplug_irq_storm_detect(struct drm_device *dev, + u32 hotplug_trigger, + const u32 *hpd) +{ + drm_i915_private_t *dev_priv = dev-dev_private; + unsigned long irqflags; + int i; + + spin_lock_irqsave(dev_priv-irq_lock, irqflags); + + for (i = 1; i HPD_NUM_PINS; i++) { + if ((hpd[i] hotplug_trigger) + dev_priv-hpd_stats[i].hpd_mark == HPD_ENABLED) { + if (!time_in_range(jiffies, dev_priv-hpd_stats[i].hpd_last_jiffies, + dev_priv-hpd_stats[i].hpd_last_jiffies + + msecs_to_jiffies(HPD_STORM_DETECT_PERIOD))) { + dev_priv-hpd_stats[i].hpd_last_jiffies = jiffies; + dev_priv-hpd_stats[i].hpd_cnt = 0; + } else if (dev_priv-hpd_stats[i].hpd_cnt 5) { + dev_priv-hpd_stats[i].hpd_mark = HPD_MARK_DISABLED; + DRM_DEBUG_KMS(HPD interrupt storm detected on PIN %d\n, i); + } else + dev_priv-hpd_stats[i].hpd_cnt++; + } + } + + spin_unlock_irqrestore(dev_priv-irq_lock, irqflags); +} + static void gmbus_irq_handler(struct drm_device *dev) { struct drm_i915_private *dev_priv = (drm_i915_private_t *) dev-dev_private; @@ -650,13 +681,15 @@ static irqreturn_t valleyview_irq_handler(int irq, void *arg) /* Consume port. Then clear IIR or we'll miss events */ if (iir I915_DISPLAY_PORT_INTERRUPT) { u32 hotplug_status = I915_READ(PORT_HOTPLUG_STAT); + u32 hotplug_trigger = hotplug_status HOTPLUG_INT_STATUS_I915; DRM_DEBUG_DRIVER(hotplug event received, stat 0x%08x\n, hotplug_status); - if (hotplug_status HOTPLUG_INT_STATUS_I915) + if (hotplug_trigger) { + hotplug_irq_storm_detect(dev, hotplug_trigger, hpd_status_i915); queue_work(dev_priv-wq, dev_priv-hotplug_work); - +