Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Jisheng Zhang
Dear Thomas,

On Mon, 6 Jul 2015 15:51:28 +0200
Thomas Gleixner  wrote:

> On Mon, 6 Jul 2015, Jisheng Zhang wrote:
> > On Mon, 6 Jul 2015 12:30:01 +0200
> > Thomas Gleixner  wrote:
> > 
> > > On Mon, 6 Jul 2015, Jisheng Zhang wrote:
> > > > +static int dw_apb_ictl_set_affinity(struct irq_data *d,
> > > > +   const struct cpumask *mask_val,
> > > > +   bool force)
> > > > +{
> > > > +   struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> > > > +   struct dw_apb_ictl_priv *priv = gc->private;
> > > > +   struct irq_chip *chip = irq_get_chip(priv->parent_irq);
> > > > +   struct irq_data *data = irq_get_irq_data(priv->parent_irq);
> > > > +
> > > > +   if (chip && chip->irq_set_affinity)
> > > > +   return chip->irq_set_affinity(data, mask_val, force);
> > > 
> > > This is wrong as it lacks proper locking of the parent irq. That needs
> > > to be solved at the core code level in a clean way.
> > 
> > Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the
> > following:
> > 
> > if (force)
> > return irq_force_affinity(priv->parent_irq, mask_val);
> > else
> > return irq_set_affinity(priv->parent_irq, mask_val);
> 
> Not from the driver, as you run into lock nesting hell. As I said,
> this needs to be solved at the core code level and needs a proper
> thought out design.

Got it. Thanks for the clarification.

> 
> Just for the record: I'm not too happy about that 'fiddle with the
> parent' mechanism because it opens just a large can of worms. I wish
> hardware designers would talk to OS people before they implement random
> nonsense.
> 

Fully agree with you. I'm requesting our HW people to connect timer to GIC
directly in future chips. But in existing chips, it seems we have to wait
for core code ready or use this lack of proper locking set_affinity patch
ourself.

Thanks,
Jisheng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Thomas Gleixner
On Mon, 6 Jul 2015, Jisheng Zhang wrote:
> On Mon, 6 Jul 2015 12:30:01 +0200
> Thomas Gleixner  wrote:
> 
> > On Mon, 6 Jul 2015, Jisheng Zhang wrote:
> > > +static int dw_apb_ictl_set_affinity(struct irq_data *d,
> > > + const struct cpumask *mask_val,
> > > + bool force)
> > > +{
> > > + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> > > + struct dw_apb_ictl_priv *priv = gc->private;
> > > + struct irq_chip *chip = irq_get_chip(priv->parent_irq);
> > > + struct irq_data *data = irq_get_irq_data(priv->parent_irq);
> > > +
> > > + if (chip && chip->irq_set_affinity)
> > > + return chip->irq_set_affinity(data, mask_val, force);
> > 
> > This is wrong as it lacks proper locking of the parent irq. That needs
> > to be solved at the core code level in a clean way.
> 
> Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the
> following:
> 
> if (force)
>   return irq_force_affinity(priv->parent_irq, mask_val);
> else
>   return irq_set_affinity(priv->parent_irq, mask_val);

Not from the driver, as you run into lock nesting hell. As I said,
this needs to be solved at the core code level and needs a proper
thought out design.

Just for the record: I'm not too happy about that 'fiddle with the
parent' mechanism because it opens just a large can of worms. I wish
hardware designers would talk to OS people before they implement random
nonsense.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Jisheng Zhang
On Mon, 6 Jul 2015 12:30:01 +0200
Thomas Gleixner  wrote:

> On Mon, 6 Jul 2015, Jisheng Zhang wrote:
> > +static int dw_apb_ictl_set_affinity(struct irq_data *d,
> > +   const struct cpumask *mask_val,
> > +   bool force)
> > +{
> > +   struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> > +   struct dw_apb_ictl_priv *priv = gc->private;
> > +   struct irq_chip *chip = irq_get_chip(priv->parent_irq);
> > +   struct irq_data *data = irq_get_irq_data(priv->parent_irq);
> > +
> > +   if (chip && chip->irq_set_affinity)
> > +   return chip->irq_set_affinity(data, mask_val, force);
> 
> This is wrong as it lacks proper locking of the parent irq. That needs
> to be solved at the core code level in a clean way.

Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the
following:

if (force)
return irq_force_affinity(priv->parent_irq, mask_val);
else
return irq_set_affinity(priv->parent_irq, mask_val);

Thanks,
Jisheng
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Thomas Gleixner
On Mon, 6 Jul 2015, Jisheng Zhang wrote:
> +static int dw_apb_ictl_set_affinity(struct irq_data *d,
> + const struct cpumask *mask_val,
> + bool force)
> +{
> + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
> + struct dw_apb_ictl_priv *priv = gc->private;
> + struct irq_chip *chip = irq_get_chip(priv->parent_irq);
> + struct irq_data *data = irq_get_irq_data(priv->parent_irq);
> +
> + if (chip && chip->irq_set_affinity)
> + return chip->irq_set_affinity(data, mask_val, force);

This is wrong as it lacks proper locking of the parent irq. That needs
to be solved at the core code level in a clean way.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Thomas Gleixner
On Mon, 6 Jul 2015, Jisheng Zhang wrote:
 +static int dw_apb_ictl_set_affinity(struct irq_data *d,
 + const struct cpumask *mask_val,
 + bool force)
 +{
 + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
 + struct dw_apb_ictl_priv *priv = gc-private;
 + struct irq_chip *chip = irq_get_chip(priv-parent_irq);
 + struct irq_data *data = irq_get_irq_data(priv-parent_irq);
 +
 + if (chip  chip-irq_set_affinity)
 + return chip-irq_set_affinity(data, mask_val, force);

This is wrong as it lacks proper locking of the parent irq. That needs
to be solved at the core code level in a clean way.

Thanks,

tglx


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Jisheng Zhang
On Mon, 6 Jul 2015 12:30:01 +0200
Thomas Gleixner t...@linutronix.de wrote:

 On Mon, 6 Jul 2015, Jisheng Zhang wrote:
  +static int dw_apb_ictl_set_affinity(struct irq_data *d,
  +   const struct cpumask *mask_val,
  +   bool force)
  +{
  +   struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
  +   struct dw_apb_ictl_priv *priv = gc-private;
  +   struct irq_chip *chip = irq_get_chip(priv-parent_irq);
  +   struct irq_data *data = irq_get_irq_data(priv-parent_irq);
  +
  +   if (chip  chip-irq_set_affinity)
  +   return chip-irq_set_affinity(data, mask_val, force);
 
 This is wrong as it lacks proper locking of the parent irq. That needs
 to be solved at the core code level in a clean way.

Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the
following:

if (force)
return irq_force_affinity(priv-parent_irq, mask_val);
else
return irq_set_affinity(priv-parent_irq, mask_val);

Thanks,
Jisheng
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Thomas Gleixner
On Mon, 6 Jul 2015, Jisheng Zhang wrote:
 On Mon, 6 Jul 2015 12:30:01 +0200
 Thomas Gleixner t...@linutronix.de wrote:
 
  On Mon, 6 Jul 2015, Jisheng Zhang wrote:
   +static int dw_apb_ictl_set_affinity(struct irq_data *d,
   + const struct cpumask *mask_val,
   + bool force)
   +{
   + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
   + struct dw_apb_ictl_priv *priv = gc-private;
   + struct irq_chip *chip = irq_get_chip(priv-parent_irq);
   + struct irq_data *data = irq_get_irq_data(priv-parent_irq);
   +
   + if (chip  chip-irq_set_affinity)
   + return chip-irq_set_affinity(data, mask_val, force);
  
  This is wrong as it lacks proper locking of the parent irq. That needs
  to be solved at the core code level in a clean way.
 
 Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the
 following:
 
 if (force)
   return irq_force_affinity(priv-parent_irq, mask_val);
 else
   return irq_set_affinity(priv-parent_irq, mask_val);

Not from the driver, as you run into lock nesting hell. As I said,
this needs to be solved at the core code level and needs a proper
thought out design.

Just for the record: I'm not too happy about that 'fiddle with the
parent' mechanism because it opens just a large can of worms. I wish
hardware designers would talk to OS people before they implement random
nonsense.

Thanks,

tglx


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-06 Thread Jisheng Zhang
Dear Thomas,

On Mon, 6 Jul 2015 15:51:28 +0200
Thomas Gleixner t...@linutronix.de wrote:

 On Mon, 6 Jul 2015, Jisheng Zhang wrote:
  On Mon, 6 Jul 2015 12:30:01 +0200
  Thomas Gleixner t...@linutronix.de wrote:
  
   On Mon, 6 Jul 2015, Jisheng Zhang wrote:
+static int dw_apb_ictl_set_affinity(struct irq_data *d,
+   const struct cpumask *mask_val,
+   bool force)
+{
+   struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
+   struct dw_apb_ictl_priv *priv = gc-private;
+   struct irq_chip *chip = irq_get_chip(priv-parent_irq);
+   struct irq_data *data = irq_get_irq_data(priv-parent_irq);
+
+   if (chip  chip-irq_set_affinity)
+   return chip-irq_set_affinity(data, mask_val, force);
   
   This is wrong as it lacks proper locking of the parent irq. That needs
   to be solved at the core code level in a clean way.
  
  Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the
  following:
  
  if (force)
  return irq_force_affinity(priv-parent_irq, mask_val);
  else
  return irq_set_affinity(priv-parent_irq, mask_val);
 
 Not from the driver, as you run into lock nesting hell. As I said,
 this needs to be solved at the core code level and needs a proper
 thought out design.

Got it. Thanks for the clarification.

 
 Just for the record: I'm not too happy about that 'fiddle with the
 parent' mechanism because it opens just a large can of worms. I wish
 hardware designers would talk to OS people before they implement random
 nonsense.
 

Fully agree with you. I'm requesting our HW people to connect timer to GIC
directly in future chips. But in existing chips, it seems we have to wait
for core code ready or use this lack of proper locking set_affinity patch
ourself.

Thanks,
Jisheng
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-05 Thread Jisheng Zhang
On Marvell Berlin SoCs, the cpu's local timer is shutdown when the cpu
goes to a deep idle state, then the timer framework will be notified to
use a broadcast timer instead. In our case, the broadcast timer uses
dw-apb-ictl as interrupt chip. This patch adds irq_set_affinity support
so that the going to deep idle state cpu can set the interrupt affinity
of the broadcast interrupt to avoid unnecessary wakeups and IPIs.

NOTE: We achieved this by changing the parent interrupt affinity of a
chained interrupt, so it migrates every interrupt on the child interrupt
controller which isn't a good thing to do as pointed out by Russell
King. Thomas pointed out "we should at least make that an opt-in behaviour
and not enabled by default", this patch adds a device tree option for
this purpose. If "migrates every interrupt on the child interrupt controller"
doesn't matter, turning on the option would add irq_set_affinity
support which might be useful in some cases (EG: can save reduce power
consumption on Marvell Berlin SoCs).

A simple test:
~ # rm /tmp/test.sh
~ # cat > /tmp/test.sh
cat /proc/interrupts
for i in `seq 10` ; do sleep $i; done
cat /proc/interrupts
~ # chmod +x /tmp/test.sh
~ # taskset 0x2 /tmp/test.sh

without the patch:

   CPU0   CPU1
 27:115 36   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160: 88  0  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:  0  0  interrupt-controller  21  f7e84000.i2c
231:  0  0  interrupt-controller  20  f7e84800.i2c
265:445  0  interrupt-controller   8  serial
IPI0:  0  0  CPU wakeup interrupts
IPI1:  0 11  Timer broadcast interrupts
IPI2: 56104  Rescheduling interrupts
IPI3:  0  0  Function call interrupts
IPI4:  0  4  Single function call interrupts
IPI5:  0  0  CPU stop interrupts
IPI6: 25 27  IRQ work interrupts
IPI7:  0  0  completion interrupts
IPI8:  0  0  CPU backtrace
Err:  0
   CPU0   CPU1
 27:115 38   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160:160  0  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:  0  0  interrupt-controller  21  f7e84000.i2c
231:  0  0  interrupt-controller  20  f7e84800.i2c
265:514  0  interrupt-controller   8  serial
IPI0:  0  0  CPU wakeup interrupts
IPI1:  0 83  Timer broadcast interrupts
IPI2: 56104  Rescheduling interrupts
IPI3:  0  0  Function call interrupts
IPI4:  0  4  Single function call interrupts
IPI5:  0  0  CPU stop interrupts
IPI6: 25 46  IRQ work interrupts
IPI7:  0  0  completion interrupts
IPI8:  0  0  CPU backtrace
Err:  0

cpu0 get 160-88=72 timer interrupts, CPU1 got 83-11=72 broadcast timer
IPIs. So, overall system got 72+72=144 wake ups and 72 broadcast timer IPIs

With the patch:
   CPU0   CPU1
 27:107 37   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160: 66  7  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:  0  0  interrupt-controller  21  f7e84000.i2c
231:  0  0  interrupt-controller  20  f7e84800.i2c
265:311  0  interrupt-controller   8  serial
IPI0:  0  0  CPU wakeup interrupts
IPI1:  2  4  Timer broadcast interrupts
IPI2: 58100  Rescheduling interrupts
IPI3:  0  0  Function call interrupts
IPI4:  0  4  Single function call interrupts
IPI5:  0  0  CPU stop interrupts
IPI6: 21 24  IRQ work interrupts
IPI7:  0  0  completion interrupts
IPI8:  0  0  CPU backtrace
Err:  0
   CPU0   CPU1
 27:107 39   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160: 69 75  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:   

[PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support

2015-07-05 Thread Jisheng Zhang
On Marvell Berlin SoCs, the cpu's local timer is shutdown when the cpu
goes to a deep idle state, then the timer framework will be notified to
use a broadcast timer instead. In our case, the broadcast timer uses
dw-apb-ictl as interrupt chip. This patch adds irq_set_affinity support
so that the going to deep idle state cpu can set the interrupt affinity
of the broadcast interrupt to avoid unnecessary wakeups and IPIs.

NOTE: We achieved this by changing the parent interrupt affinity of a
chained interrupt, so it migrates every interrupt on the child interrupt
controller which isn't a good thing to do as pointed out by Russell
King. Thomas pointed out we should at least make that an opt-in behaviour
and not enabled by default, this patch adds a device tree option for
this purpose. If migrates every interrupt on the child interrupt controller
doesn't matter, turning on the option would add irq_set_affinity
support which might be useful in some cases (EG: can save reduce power
consumption on Marvell Berlin SoCs).

A simple test:
~ # rm /tmp/test.sh
~ # cat  /tmp/test.sh
cat /proc/interrupts
for i in `seq 10` ; do sleep $i; done
cat /proc/interrupts
~ # chmod +x /tmp/test.sh
~ # taskset 0x2 /tmp/test.sh

without the patch:

   CPU0   CPU1
 27:115 36   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160: 88  0  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:  0  0  interrupt-controller  21  f7e84000.i2c
231:  0  0  interrupt-controller  20  f7e84800.i2c
265:445  0  interrupt-controller   8  serial
IPI0:  0  0  CPU wakeup interrupts
IPI1:  0 11  Timer broadcast interrupts
IPI2: 56104  Rescheduling interrupts
IPI3:  0  0  Function call interrupts
IPI4:  0  4  Single function call interrupts
IPI5:  0  0  CPU stop interrupts
IPI6: 25 27  IRQ work interrupts
IPI7:  0  0  completion interrupts
IPI8:  0  0  CPU backtrace
Err:  0
   CPU0   CPU1
 27:115 38   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160:160  0  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:  0  0  interrupt-controller  21  f7e84000.i2c
231:  0  0  interrupt-controller  20  f7e84800.i2c
265:514  0  interrupt-controller   8  serial
IPI0:  0  0  CPU wakeup interrupts
IPI1:  0 83  Timer broadcast interrupts
IPI2: 56104  Rescheduling interrupts
IPI3:  0  0  Function call interrupts
IPI4:  0  4  Single function call interrupts
IPI5:  0  0  CPU stop interrupts
IPI6: 25 46  IRQ work interrupts
IPI7:  0  0  completion interrupts
IPI8:  0  0  CPU backtrace
Err:  0

cpu0 get 160-88=72 timer interrupts, CPU1 got 83-11=72 broadcast timer
IPIs. So, overall system got 72+72=144 wake ups and 72 broadcast timer IPIs

With the patch:
   CPU0   CPU1
 27:107 37   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160: 66  7  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230:  0  0  interrupt-controller  21  f7e84000.i2c
231:  0  0  interrupt-controller  20  f7e84800.i2c
265:311  0  interrupt-controller   8  serial
IPI0:  0  0  CPU wakeup interrupts
IPI1:  2  4  Timer broadcast interrupts
IPI2: 58100  Rescheduling interrupts
IPI3:  0  0  Function call interrupts
IPI4:  0  4  Single function call interrupts
IPI5:  0  0  CPU stop interrupts
IPI6: 21 24  IRQ work interrupts
IPI7:  0  0  completion interrupts
IPI8:  0  0  CPU backtrace
Err:  0
   CPU0   CPU1
 27:107 39   GIC  27  arch_timer
 45: 62  0   GIC  45  mmc0
160: 69 75  interrupt-controller   8  timer
227:  0  0  interrupt-controller   4  f7e81400.i2c
228:  0  0  interrupt-controller   5  f7e81800.i2c
229:  0  0  interrupt-controller   7  dw_spi65535
230: