Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)

2013-04-16 Thread Egbert Eich
Hi Jani,

On Thu, Apr 11, 2013 at 12:32:29PM +0300, Jani Nikula wrote:
 
 Hi Egbert -
 
 Up front, I haven't been following this series or read any of the
 previous review comments. Please bear with me, and feel free to direct
 me to earlier comments if I'm in contradiction.

Sorry for the late reply - last week I was quite busy with other stuff,
this week i'm a bit preoccupied.

 
 On Tue, 09 Apr 2013, Egbert Eich e...@freedesktop.org wrote:
  From: Egbert Eich e...@suse.de
 
  Add a hotplug IRQ storm detection (triggered when a hotplug interrupt
  fires more than 5 times / sec).
 
 Okay, this is theoretical, but a display port sink could do more than
 that many hpd irq requests when connected. Which leads me to wonder in
 general if the storm detection should be different for hot plug
 vs. unplug and hpd irq events.

Agreed. 
During my tests I did not see any issues with the statistics I've 
implemented: 5/sec was an ad-hoc choice and I wanted to start with 
something simple.
I've tested it and it seemed to work, so I didn't bother to look into
this more deeply, however if you feel 5 events  / sec are too few to
really do a good distinction, we could easily increase this number.

There have been two situations where I have seen 'interrupt storms':

1. On G35: some boxes of the affected systems do not show this issue 
   at all while others see a very high load but are still usable.
   In my recollection there were in the order of some 100 interrupts/sec
   on these machines. Then there were systems which would 'stall' in
   the worker during boot due to high load. At one point the NMI 
   watchdog kicked in and stopped this mess. There the interrupts
   happened at an order of magnitude if 10k!

2. A laptop with a Sandybridge chipset where the system load went
   high at certain stages of charging levels - when the power supply
   was connected. I would assume the frequency there also was around
   some 100 / sec.

Thus if we increase the threshold frequency to some 10 / sec we
would still cover all those cases.

Some other issue I've seen is 'bouncing' during manual plugging
I've been contemplating how to address this. There are two things 
to look at:
1. multiple hotplug events due to not getting a perfect connection at first
2. EDID readout happening to early when the EDID lines are not yet fully
   and 'permanently' connected.
It might well be, that a fix for these issues might actually also adress 
the issues you are pointing out.

I have not seen them on Intel hardware - but this may be due to the 
fact that the hardware I saw it on was a separate gfx card which did 
not have the usual mounting bracket and thus the entire setup was a 
bit fragile and not really suitable for hot-plugging.
However I believe that these things might happen everywere for people
not quite used to plugging monitors too much.

 
 Have you observed difference between hot plug/unplug?

There seems to be a difference between monitor connected/not connected:

On DVI (G35) one doesn't distinguish between plug/unplug: when the hotplug
line on the connector changes state an interrupt is sent. On this system
storms only happened when a monitor was conneted - since the state of the
HPD pin is signalled thru different frequencies on a line across SDVO (in 
my recollection it was 10 vs. 20 MHz) I believe that due to cross talk
the higher(?) frequency could not always reliably be measured.

I did not have access to the laptop system and the customer was not 
patient enough to help me to debug this further with me.

Generally I think we would still adress the 'strom condition' if we
raised the threshold to 20 or 30 /sec.

What do you think?

 
 Has this been a problem on PCH split platforms, i.e. since ilk/gen5?

I've also observed this on Sandybridge - which would be past GEN5, 
wouldn't it?

I will address some of the other issues mentioned in a new patch.

Thanks a lot for looking at it!

Cheers,
Egbert.

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)

2013-04-16 Thread Egbert Eich
Hi Jani,

I've rebased and regenerated all the patches now, as there 
were some other bikesheds i had not adressed. I've also
included all Reviewed-By: 
This should make it easier to integrate the patches.

Some comments below:


On Thu, Apr 11, 2013 at 12:32:29PM +0300, Jani Nikula wrote:
  +   struct {
  +   unsigned long hpd_last_jiffies;
  +   int hpd_cnt;
  +   enum {
  +   HPD_ENABLED = 0,
  +   HPD_DISABLED = 1,
  +   HPD_MARK_DISABLED = 2
  +   } hpd_mark;
  +   } hpd_stats[HPD_NUM_PINS];
 
 With all the hotplug stuff being added, I think it's getting time to
 group all the hotplug stuff under a struct.

I will postpone this until I address the issues that I have on my TODO.

 
  int num_pch_pll;
  int num_plane;
  diff --git a/drivers/gpu/drm/i915/i915_irq.c 
  b/drivers/gpu/drm/i915/i915_irq.c
  index 4c5bdd0..32b5527 100644
  --- a/drivers/gpu/drm/i915/i915_irq.c
  +++ b/drivers/gpu/drm/i915/i915_irq.c
  @@ -582,6 +582,37 @@ static void gen6_queue_rps_work(struct 
  drm_i915_private *dev_priv,
  queue_work(dev_priv-wq, dev_priv-rps.work);
   }
   
  +#define HPD_STORM_DETECT_PERIOD 1000
  +
  +static inline void hotplug_irq_storm_detect(struct drm_device *dev,
  +   u32 hotplug_trigger,
  +   const u32 *hpd)
 
 I think you should just add the bool return status right here instead of
 postponing until the later patch that needs it.

I thought about this and left it as it is: Returning a bool status now
will raise other questions as the logic in this patch doesn't require it.
I'd rather have a bigger patch later which will however clearly explains
the reason for the change to the function.

 
  +{
  +   drm_i915_private_t *dev_priv = dev-dev_private;
  +   unsigned long irqflags;
  +   int i;
  +
  +   spin_lock_irqsave(dev_priv-irq_lock, irqflags);
  +
  +   for (i = 1; i  HPD_NUM_PINS; i++) {
  +   if ((hpd[i]  hotplug_trigger) 
  +   dev_priv-hpd_stats[i].hpd_mark == HPD_ENABLED) {
 
 Bikeshed: You might get a nicer layout below by negating that if and
 adding continue.

Fixed.

 
  +   if (!time_in_range(jiffies, 
  dev_priv-hpd_stats[i].hpd_last_jiffies,
  + 
  dev_priv-hpd_stats[i].hpd_last_jiffies
  + + 
  msecs_to_jiffies(HPD_STORM_DETECT_PERIOD))) {
  +   dev_priv-hpd_stats[i].hpd_last_jiffies = 
  jiffies;
  +   dev_priv-hpd_stats[i].hpd_cnt = 0;
  +   } else if (dev_priv-hpd_stats[i].hpd_cnt  5) {
  +   dev_priv-hpd_stats[i].hpd_mark = 
  HPD_MARK_DISABLED;
  +   DRM_DEBUG_KMS(HPD interrupt storm detected on  
  PIN %d\n, i);
 
 Extra space before PIN.

Fixed.
 
  +   } else
  +   dev_priv-hpd_stats[i].hpd_cnt++;
 
 If one branch requires braces, then all do.

Fixed.

Cheers,
Egbert.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)

2013-04-11 Thread Jani Nikula

Hi Egbert -

Up front, I haven't been following this series or read any of the
previous review comments. Please bear with me, and feel free to direct
me to earlier comments if I'm in contradiction.

On Tue, 09 Apr 2013, Egbert Eich e...@freedesktop.org wrote:
 From: Egbert Eich e...@suse.de

 Add a hotplug IRQ storm detection (triggered when a hotplug interrupt
 fires more than 5 times / sec).

Okay, this is theoretical, but a display port sink could do more than
that many hpd irq requests when connected. Which leads me to wonder in
general if the storm detection should be different for hot plug
vs. unplug and hpd irq events.

 Rationale:
 Despite of the many attempts to fix the problem with noisy hotplug
 interrupt lines we are still seeing systems which have issues:
 Once cause of noise seems to be bad routing of the hotplug line
 on the board: cross talk from other signals seems to cause erronous
 hotplug interrupts. This has been documented as an erratum for the
 the i945GM chipset and thus hotplug support was disabled for this
 chipset model but others seem to have this problem, too.

 We have seen this issue on a G35 motherboard for example:
 Even different motherboards of the same model seem to behave
 differently: while some only see only around 10-100 interrupts/s
 others seem to see 5k or more.
 We've also observed a dependency on the selected video mode.

 Also on certain laptops interrupt noise seems to occur duing
 battery charging when the battery is at a certain charge levels.

Have you observed difference between hot plug/unplug?

Has this been a problem on PCH split platforms, i.e. since ilk/gen5?

 Thus we add a simple algorithm here that detects an 'interrupt storm'
 condition.

 v2: Fixed comment.
 v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work 
 stuff.
 v4: Followed by Jesse Barnes to use a time_..() macro.

 Signed-off-by: Egbert Eich e...@suse.de
 Acked-by: Chris Wilson ch...@chris-wilson.co.uk
 Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/gpu/drm/i915/i915_drv.h |  9 ++
  drivers/gpu/drm/i915/i915_irq.c | 66 
 +
  2 files changed, 63 insertions(+), 12 deletions(-)

 diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
 index 44fca0b..83fc1a6 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -929,6 +929,15 @@ typedef struct drm_i915_private {
  
   struct work_struct hotplug_work;
   bool enable_hotplug_processing;
 + struct {
 + unsigned long hpd_last_jiffies;
 + int hpd_cnt;
 + enum {
 + HPD_ENABLED = 0,
 + HPD_DISABLED = 1,
 + HPD_MARK_DISABLED = 2
 + } hpd_mark;
 + } hpd_stats[HPD_NUM_PINS];

With all the hotplug stuff being added, I think it's getting time to
group all the hotplug stuff under a struct.

   int num_pch_pll;
   int num_plane;
 diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
 index 4c5bdd0..32b5527 100644
 --- a/drivers/gpu/drm/i915/i915_irq.c
 +++ b/drivers/gpu/drm/i915/i915_irq.c
 @@ -582,6 +582,37 @@ static void gen6_queue_rps_work(struct drm_i915_private 
 *dev_priv,
   queue_work(dev_priv-wq, dev_priv-rps.work);
  }
  
 +#define HPD_STORM_DETECT_PERIOD 1000
 +
 +static inline void hotplug_irq_storm_detect(struct drm_device *dev,
 + u32 hotplug_trigger,
 + const u32 *hpd)

I think you should just add the bool return status right here instead of
postponing until the later patch that needs it.

 +{
 + drm_i915_private_t *dev_priv = dev-dev_private;
 + unsigned long irqflags;
 + int i;
 +
 + spin_lock_irqsave(dev_priv-irq_lock, irqflags);
 +
 + for (i = 1; i  HPD_NUM_PINS; i++) {
 + if ((hpd[i]  hotplug_trigger) 
 + dev_priv-hpd_stats[i].hpd_mark == HPD_ENABLED) {

Bikeshed: You might get a nicer layout below by negating that if and
adding continue.

 + if (!time_in_range(jiffies, 
 dev_priv-hpd_stats[i].hpd_last_jiffies,
 +   
 dev_priv-hpd_stats[i].hpd_last_jiffies
 +   + 
 msecs_to_jiffies(HPD_STORM_DETECT_PERIOD))) {
 + dev_priv-hpd_stats[i].hpd_last_jiffies = 
 jiffies;
 + dev_priv-hpd_stats[i].hpd_cnt = 0;
 + } else if (dev_priv-hpd_stats[i].hpd_cnt  5) {
 + dev_priv-hpd_stats[i].hpd_mark = 
 HPD_MARK_DISABLED;
 + DRM_DEBUG_KMS(HPD interrupt storm detected on  
 PIN %d\n, i);

Extra space before PIN.

 + } else
 + dev_priv-hpd_stats[i].hpd_cnt++;

If one branch requires braces, then all do.

The above ends up adding HPD_MARK_DISABLED only after 

Re: [Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)

2013-04-11 Thread Daniel Vetter
On Thu, Apr 11, 2013 at 11:32 AM, Jani Nikula
jani.nik...@linux.intel.com wrote:
 On Tue, 09 Apr 2013, Egbert Eich e...@freedesktop.org wrote:
 From: Egbert Eich e...@suse.de

 Add a hotplug IRQ storm detection (triggered when a hotplug interrupt
 fires more than 5 times / sec).

 Okay, this is theoretical, but a display port sink could do more than
 that many hpd irq requests when connected. Which leads me to wonder in
 general if the storm detection should be different for hot plug
 vs. unplug and hpd irq events.

Yeah, I've considered what to do with short pulse hotplug events. But
otoh we don't differentiate that yet in our code and worst case we
need to increase the limits a bit maybe. I guess we just need to see
what happens in reality and then act accordingly.

 Rationale:
 Despite of the many attempts to fix the problem with noisy hotplug
 interrupt lines we are still seeing systems which have issues:
 Once cause of noise seems to be bad routing of the hotplug line
 on the board: cross talk from other signals seems to cause erronous
 hotplug interrupts. This has been documented as an erratum for the
 the i945GM chipset and thus hotplug support was disabled for this
 chipset model but others seem to have this problem, too.

 We have seen this issue on a G35 motherboard for example:
 Even different motherboards of the same model seem to behave
 differently: while some only see only around 10-100 interrupts/s
 others seem to see 5k or more.
 We've also observed a dependency on the selected video mode.

 Also on certain laptops interrupt noise seems to occur duing
 battery charging when the battery is at a certain charge levels.

 Have you observed difference between hot plug/unplug?

 Has this been a problem on PCH split platforms, i.e. since ilk/gen5?

I've discussed this with Art (iirc) and apparently the BIOS/Windows
guys have some elaborate schemes for this on all platforms. So I guess
this could happen everywhere. Note that hpd storms could also be
caused in the sink, so imo we want this everywhere.

 Thus we add a simple algorithm here that detects an 'interrupt storm'
 condition.

 v2: Fixed comment.
 v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work 
 stuff.
 v4: Followed by Jesse Barnes to use a time_..() macro.

 Signed-off-by: Egbert Eich e...@suse.de
 Acked-by: Chris Wilson ch...@chris-wilson.co.uk
 Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org
 ---
  drivers/gpu/drm/i915/i915_drv.h |  9 ++
  drivers/gpu/drm/i915/i915_irq.c | 66 
 +
  2 files changed, 63 insertions(+), 12 deletions(-)

 diff --git a/drivers/gpu/drm/i915/i915_drv.h 
 b/drivers/gpu/drm/i915/i915_drv.h
 index 44fca0b..83fc1a6 100644
 --- a/drivers/gpu/drm/i915/i915_drv.h
 +++ b/drivers/gpu/drm/i915/i915_drv.h
 @@ -929,6 +929,15 @@ typedef struct drm_i915_private {

   struct work_struct hotplug_work;
   bool enable_hotplug_processing;
 + struct {
 + unsigned long hpd_last_jiffies;
 + int hpd_cnt;
 + enum {
 + HPD_ENABLED = 0,
 + HPD_DISABLED = 1,
 + HPD_MARK_DISABLED = 2
 + } hpd_mark;
 + } hpd_stats[HPD_NUM_PINS];

 With all the hotplug stuff being added, I think it's getting time to
 group all the hotplug stuff under a struct.

Yeah, agreed. Though to avoid too much rebase hell I think it's better
to do that as a follow-up.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH v3 1/7] drm/i915: Add HPD IRQ storm detection (v4)

2013-04-09 Thread Egbert Eich
From: Egbert Eich e...@suse.de

Add a hotplug IRQ storm detection (triggered when a hotplug interrupt
fires more than 5 times / sec).
Rationale:
Despite of the many attempts to fix the problem with noisy hotplug
interrupt lines we are still seeing systems which have issues:
Once cause of noise seems to be bad routing of the hotplug line
on the board: cross talk from other signals seems to cause erronous
hotplug interrupts. This has been documented as an erratum for the
the i945GM chipset and thus hotplug support was disabled for this
chipset model but others seem to have this problem, too.

We have seen this issue on a G35 motherboard for example:
Even different motherboards of the same model seem to behave
differently: while some only see only around 10-100 interrupts/s
others seem to see 5k or more.
We've also observed a dependency on the selected video mode.

Also on certain laptops interrupt noise seems to occur duing
battery charging when the battery is at a certain charge levels.

Thus we add a simple algorithm here that detects an 'interrupt storm'
condition.

v2: Fixed comment.
v3: Reordered drm_i915_private: moved hpd state tracking to hotplug work stuff.
v4: Followed by Jesse Barnes to use a time_..() macro.

Signed-off-by: Egbert Eich e...@suse.de
Acked-by: Chris Wilson ch...@chris-wilson.co.uk
Reviewed-by: Jesse Barnes jbar...@virtuousgeek.org
---
 drivers/gpu/drm/i915/i915_drv.h |  9 ++
 drivers/gpu/drm/i915/i915_irq.c | 66 +
 2 files changed, 63 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 44fca0b..83fc1a6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -929,6 +929,15 @@ typedef struct drm_i915_private {
 
struct work_struct hotplug_work;
bool enable_hotplug_processing;
+   struct {
+   unsigned long hpd_last_jiffies;
+   int hpd_cnt;
+   enum {
+   HPD_ENABLED = 0,
+   HPD_DISABLED = 1,
+   HPD_MARK_DISABLED = 2
+   } hpd_mark;
+   } hpd_stats[HPD_NUM_PINS];
 
int num_pch_pll;
int num_plane;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4c5bdd0..32b5527 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -582,6 +582,37 @@ static void gen6_queue_rps_work(struct drm_i915_private 
*dev_priv,
queue_work(dev_priv-wq, dev_priv-rps.work);
 }
 
+#define HPD_STORM_DETECT_PERIOD 1000
+
+static inline void hotplug_irq_storm_detect(struct drm_device *dev,
+   u32 hotplug_trigger,
+   const u32 *hpd)
+{
+   drm_i915_private_t *dev_priv = dev-dev_private;
+   unsigned long irqflags;
+   int i;
+
+   spin_lock_irqsave(dev_priv-irq_lock, irqflags);
+
+   for (i = 1; i  HPD_NUM_PINS; i++) {
+   if ((hpd[i]  hotplug_trigger) 
+   dev_priv-hpd_stats[i].hpd_mark == HPD_ENABLED) {
+   if (!time_in_range(jiffies, 
dev_priv-hpd_stats[i].hpd_last_jiffies,
+ 
dev_priv-hpd_stats[i].hpd_last_jiffies
+ + 
msecs_to_jiffies(HPD_STORM_DETECT_PERIOD))) {
+   dev_priv-hpd_stats[i].hpd_last_jiffies = 
jiffies;
+   dev_priv-hpd_stats[i].hpd_cnt = 0;
+   } else if (dev_priv-hpd_stats[i].hpd_cnt  5) {
+   dev_priv-hpd_stats[i].hpd_mark = 
HPD_MARK_DISABLED;
+   DRM_DEBUG_KMS(HPD interrupt storm detected on  
PIN %d\n, i);
+   } else
+   dev_priv-hpd_stats[i].hpd_cnt++;
+   }
+   }
+
+   spin_unlock_irqrestore(dev_priv-irq_lock, irqflags);
+}
+
 static void gmbus_irq_handler(struct drm_device *dev)
 {
struct drm_i915_private *dev_priv = (drm_i915_private_t *) 
dev-dev_private;
@@ -650,13 +681,15 @@ static irqreturn_t valleyview_irq_handler(int irq, void 
*arg)
/* Consume port.  Then clear IIR or we'll miss events */
if (iir  I915_DISPLAY_PORT_INTERRUPT) {
u32 hotplug_status = I915_READ(PORT_HOTPLUG_STAT);
+   u32 hotplug_trigger = hotplug_status  
HOTPLUG_INT_STATUS_I915;
 
DRM_DEBUG_DRIVER(hotplug event received, stat 
0x%08x\n,
 hotplug_status);
-   if (hotplug_status  HOTPLUG_INT_STATUS_I915)
+   if (hotplug_trigger) {
+   hotplug_irq_storm_detect(dev, hotplug_trigger, 
hpd_status_i915);
queue_work(dev_priv-wq,
   dev_priv-hotplug_work);
-
+