Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
Dear Thomas, On Mon, 6 Jul 2015 15:51:28 +0200 Thomas Gleixner wrote: > On Mon, 6 Jul 2015, Jisheng Zhang wrote: > > On Mon, 6 Jul 2015 12:30:01 +0200 > > Thomas Gleixner wrote: > > > > > On Mon, 6 Jul 2015, Jisheng Zhang wrote: > > > > +static int dw_apb_ictl_set_affinity(struct irq_data *d, > > > > + const struct cpumask *mask_val, > > > > + bool force) > > > > +{ > > > > + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); > > > > + struct dw_apb_ictl_priv *priv = gc->private; > > > > + struct irq_chip *chip = irq_get_chip(priv->parent_irq); > > > > + struct irq_data *data = irq_get_irq_data(priv->parent_irq); > > > > + > > > > + if (chip && chip->irq_set_affinity) > > > > + return chip->irq_set_affinity(data, mask_val, force); > > > > > > This is wrong as it lacks proper locking of the parent irq. That needs > > > to be solved at the core code level in a clean way. > > > > Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the > > following: > > > > if (force) > > return irq_force_affinity(priv->parent_irq, mask_val); > > else > > return irq_set_affinity(priv->parent_irq, mask_val); > > Not from the driver, as you run into lock nesting hell. As I said, > this needs to be solved at the core code level and needs a proper > thought out design. Got it. Thanks for the clarification. > > Just for the record: I'm not too happy about that 'fiddle with the > parent' mechanism because it opens just a large can of worms. I wish > hardware designers would talk to OS people before they implement random > nonsense. > Fully agree with you. I'm requesting our HW people to connect timer to GIC directly in future chips. But in existing chips, it seems we have to wait for core code ready or use this lack of proper locking set_affinity patch ourself. Thanks, Jisheng -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Mon, 6 Jul 2015, Jisheng Zhang wrote: > On Mon, 6 Jul 2015 12:30:01 +0200 > Thomas Gleixner wrote: > > > On Mon, 6 Jul 2015, Jisheng Zhang wrote: > > > +static int dw_apb_ictl_set_affinity(struct irq_data *d, > > > + const struct cpumask *mask_val, > > > + bool force) > > > +{ > > > + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); > > > + struct dw_apb_ictl_priv *priv = gc->private; > > > + struct irq_chip *chip = irq_get_chip(priv->parent_irq); > > > + struct irq_data *data = irq_get_irq_data(priv->parent_irq); > > > + > > > + if (chip && chip->irq_set_affinity) > > > + return chip->irq_set_affinity(data, mask_val, force); > > > > This is wrong as it lacks proper locking of the parent irq. That needs > > to be solved at the core code level in a clean way. > > Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the > following: > > if (force) > return irq_force_affinity(priv->parent_irq, mask_val); > else > return irq_set_affinity(priv->parent_irq, mask_val); Not from the driver, as you run into lock nesting hell. As I said, this needs to be solved at the core code level and needs a proper thought out design. Just for the record: I'm not too happy about that 'fiddle with the parent' mechanism because it opens just a large can of worms. I wish hardware designers would talk to OS people before they implement random nonsense. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Mon, 6 Jul 2015 12:30:01 +0200 Thomas Gleixner wrote: > On Mon, 6 Jul 2015, Jisheng Zhang wrote: > > +static int dw_apb_ictl_set_affinity(struct irq_data *d, > > + const struct cpumask *mask_val, > > + bool force) > > +{ > > + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); > > + struct dw_apb_ictl_priv *priv = gc->private; > > + struct irq_chip *chip = irq_get_chip(priv->parent_irq); > > + struct irq_data *data = irq_get_irq_data(priv->parent_irq); > > + > > + if (chip && chip->irq_set_affinity) > > + return chip->irq_set_affinity(data, mask_val, force); > > This is wrong as it lacks proper locking of the parent irq. That needs > to be solved at the core code level in a clean way. Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the following: if (force) return irq_force_affinity(priv->parent_irq, mask_val); else return irq_set_affinity(priv->parent_irq, mask_val); Thanks, Jisheng -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Mon, 6 Jul 2015, Jisheng Zhang wrote: > +static int dw_apb_ictl_set_affinity(struct irq_data *d, > + const struct cpumask *mask_val, > + bool force) > +{ > + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); > + struct dw_apb_ictl_priv *priv = gc->private; > + struct irq_chip *chip = irq_get_chip(priv->parent_irq); > + struct irq_data *data = irq_get_irq_data(priv->parent_irq); > + > + if (chip && chip->irq_set_affinity) > + return chip->irq_set_affinity(data, mask_val, force); This is wrong as it lacks proper locking of the parent irq. That needs to be solved at the core code level in a clean way. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Mon, 6 Jul 2015, Jisheng Zhang wrote: +static int dw_apb_ictl_set_affinity(struct irq_data *d, + const struct cpumask *mask_val, + bool force) +{ + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); + struct dw_apb_ictl_priv *priv = gc-private; + struct irq_chip *chip = irq_get_chip(priv-parent_irq); + struct irq_data *data = irq_get_irq_data(priv-parent_irq); + + if (chip chip-irq_set_affinity) + return chip-irq_set_affinity(data, mask_val, force); This is wrong as it lacks proper locking of the parent irq. That needs to be solved at the core code level in a clean way. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Mon, 6 Jul 2015 12:30:01 +0200 Thomas Gleixner t...@linutronix.de wrote: On Mon, 6 Jul 2015, Jisheng Zhang wrote: +static int dw_apb_ictl_set_affinity(struct irq_data *d, + const struct cpumask *mask_val, + bool force) +{ + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); + struct dw_apb_ictl_priv *priv = gc-private; + struct irq_chip *chip = irq_get_chip(priv-parent_irq); + struct irq_data *data = irq_get_irq_data(priv-parent_irq); + + if (chip chip-irq_set_affinity) + return chip-irq_set_affinity(data, mask_val, force); This is wrong as it lacks proper locking of the parent irq. That needs to be solved at the core code level in a clean way. Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the following: if (force) return irq_force_affinity(priv-parent_irq, mask_val); else return irq_set_affinity(priv-parent_irq, mask_val); Thanks, Jisheng -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Mon, 6 Jul 2015, Jisheng Zhang wrote: On Mon, 6 Jul 2015 12:30:01 +0200 Thomas Gleixner t...@linutronix.de wrote: On Mon, 6 Jul 2015, Jisheng Zhang wrote: +static int dw_apb_ictl_set_affinity(struct irq_data *d, + const struct cpumask *mask_val, + bool force) +{ + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); + struct dw_apb_ictl_priv *priv = gc-private; + struct irq_chip *chip = irq_get_chip(priv-parent_irq); + struct irq_data *data = irq_get_irq_data(priv-parent_irq); + + if (chip chip-irq_set_affinity) + return chip-irq_set_affinity(data, mask_val, force); This is wrong as it lacks proper locking of the parent irq. That needs to be solved at the core code level in a clean way. Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the following: if (force) return irq_force_affinity(priv-parent_irq, mask_val); else return irq_set_affinity(priv-parent_irq, mask_val); Not from the driver, as you run into lock nesting hell. As I said, this needs to be solved at the core code level and needs a proper thought out design. Just for the record: I'm not too happy about that 'fiddle with the parent' mechanism because it opens just a large can of worms. I wish hardware designers would talk to OS people before they implement random nonsense. Thanks, tglx -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
Dear Thomas, On Mon, 6 Jul 2015 15:51:28 +0200 Thomas Gleixner t...@linutronix.de wrote: On Mon, 6 Jul 2015, Jisheng Zhang wrote: On Mon, 6 Jul 2015 12:30:01 +0200 Thomas Gleixner t...@linutronix.de wrote: On Mon, 6 Jul 2015, Jisheng Zhang wrote: +static int dw_apb_ictl_set_affinity(struct irq_data *d, + const struct cpumask *mask_val, + bool force) +{ + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); + struct dw_apb_ictl_priv *priv = gc-private; + struct irq_chip *chip = irq_get_chip(priv-parent_irq); + struct irq_data *data = irq_get_irq_data(priv-parent_irq); + + if (chip chip-irq_set_affinity) + return chip-irq_set_affinity(data, mask_val, force); This is wrong as it lacks proper locking of the parent irq. That needs to be solved at the core code level in a clean way. Is it acceptable to call irq_set_affinity() or irq_force_affinity() as the following: if (force) return irq_force_affinity(priv-parent_irq, mask_val); else return irq_set_affinity(priv-parent_irq, mask_val); Not from the driver, as you run into lock nesting hell. As I said, this needs to be solved at the core code level and needs a proper thought out design. Got it. Thanks for the clarification. Just for the record: I'm not too happy about that 'fiddle with the parent' mechanism because it opens just a large can of worms. I wish hardware designers would talk to OS people before they implement random nonsense. Fully agree with you. I'm requesting our HW people to connect timer to GIC directly in future chips. But in existing chips, it seems we have to wait for core code ready or use this lack of proper locking set_affinity patch ourself. Thanks, Jisheng -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Marvell Berlin SoCs, the cpu's local timer is shutdown when the cpu goes to a deep idle state, then the timer framework will be notified to use a broadcast timer instead. In our case, the broadcast timer uses dw-apb-ictl as interrupt chip. This patch adds irq_set_affinity support so that the going to deep idle state cpu can set the interrupt affinity of the broadcast interrupt to avoid unnecessary wakeups and IPIs. NOTE: We achieved this by changing the parent interrupt affinity of a chained interrupt, so it migrates every interrupt on the child interrupt controller which isn't a good thing to do as pointed out by Russell King. Thomas pointed out "we should at least make that an opt-in behaviour and not enabled by default", this patch adds a device tree option for this purpose. If "migrates every interrupt on the child interrupt controller" doesn't matter, turning on the option would add irq_set_affinity support which might be useful in some cases (EG: can save reduce power consumption on Marvell Berlin SoCs). A simple test: ~ # rm /tmp/test.sh ~ # cat > /tmp/test.sh cat /proc/interrupts for i in `seq 10` ; do sleep $i; done cat /proc/interrupts ~ # chmod +x /tmp/test.sh ~ # taskset 0x2 /tmp/test.sh without the patch: CPU0 CPU1 27:115 36 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160: 88 0 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: 0 0 interrupt-controller 21 f7e84000.i2c 231: 0 0 interrupt-controller 20 f7e84800.i2c 265:445 0 interrupt-controller 8 serial IPI0: 0 0 CPU wakeup interrupts IPI1: 0 11 Timer broadcast interrupts IPI2: 56104 Rescheduling interrupts IPI3: 0 0 Function call interrupts IPI4: 0 4 Single function call interrupts IPI5: 0 0 CPU stop interrupts IPI6: 25 27 IRQ work interrupts IPI7: 0 0 completion interrupts IPI8: 0 0 CPU backtrace Err: 0 CPU0 CPU1 27:115 38 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160:160 0 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: 0 0 interrupt-controller 21 f7e84000.i2c 231: 0 0 interrupt-controller 20 f7e84800.i2c 265:514 0 interrupt-controller 8 serial IPI0: 0 0 CPU wakeup interrupts IPI1: 0 83 Timer broadcast interrupts IPI2: 56104 Rescheduling interrupts IPI3: 0 0 Function call interrupts IPI4: 0 4 Single function call interrupts IPI5: 0 0 CPU stop interrupts IPI6: 25 46 IRQ work interrupts IPI7: 0 0 completion interrupts IPI8: 0 0 CPU backtrace Err: 0 cpu0 get 160-88=72 timer interrupts, CPU1 got 83-11=72 broadcast timer IPIs. So, overall system got 72+72=144 wake ups and 72 broadcast timer IPIs With the patch: CPU0 CPU1 27:107 37 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160: 66 7 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: 0 0 interrupt-controller 21 f7e84000.i2c 231: 0 0 interrupt-controller 20 f7e84800.i2c 265:311 0 interrupt-controller 8 serial IPI0: 0 0 CPU wakeup interrupts IPI1: 2 4 Timer broadcast interrupts IPI2: 58100 Rescheduling interrupts IPI3: 0 0 Function call interrupts IPI4: 0 4 Single function call interrupts IPI5: 0 0 CPU stop interrupts IPI6: 21 24 IRQ work interrupts IPI7: 0 0 completion interrupts IPI8: 0 0 CPU backtrace Err: 0 CPU0 CPU1 27:107 39 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160: 69 75 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230:
[PATCH v3 2/2] irqchip: dw-apb-ictl: add irq_set_affinity support
On Marvell Berlin SoCs, the cpu's local timer is shutdown when the cpu goes to a deep idle state, then the timer framework will be notified to use a broadcast timer instead. In our case, the broadcast timer uses dw-apb-ictl as interrupt chip. This patch adds irq_set_affinity support so that the going to deep idle state cpu can set the interrupt affinity of the broadcast interrupt to avoid unnecessary wakeups and IPIs. NOTE: We achieved this by changing the parent interrupt affinity of a chained interrupt, so it migrates every interrupt on the child interrupt controller which isn't a good thing to do as pointed out by Russell King. Thomas pointed out we should at least make that an opt-in behaviour and not enabled by default, this patch adds a device tree option for this purpose. If migrates every interrupt on the child interrupt controller doesn't matter, turning on the option would add irq_set_affinity support which might be useful in some cases (EG: can save reduce power consumption on Marvell Berlin SoCs). A simple test: ~ # rm /tmp/test.sh ~ # cat /tmp/test.sh cat /proc/interrupts for i in `seq 10` ; do sleep $i; done cat /proc/interrupts ~ # chmod +x /tmp/test.sh ~ # taskset 0x2 /tmp/test.sh without the patch: CPU0 CPU1 27:115 36 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160: 88 0 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: 0 0 interrupt-controller 21 f7e84000.i2c 231: 0 0 interrupt-controller 20 f7e84800.i2c 265:445 0 interrupt-controller 8 serial IPI0: 0 0 CPU wakeup interrupts IPI1: 0 11 Timer broadcast interrupts IPI2: 56104 Rescheduling interrupts IPI3: 0 0 Function call interrupts IPI4: 0 4 Single function call interrupts IPI5: 0 0 CPU stop interrupts IPI6: 25 27 IRQ work interrupts IPI7: 0 0 completion interrupts IPI8: 0 0 CPU backtrace Err: 0 CPU0 CPU1 27:115 38 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160:160 0 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: 0 0 interrupt-controller 21 f7e84000.i2c 231: 0 0 interrupt-controller 20 f7e84800.i2c 265:514 0 interrupt-controller 8 serial IPI0: 0 0 CPU wakeup interrupts IPI1: 0 83 Timer broadcast interrupts IPI2: 56104 Rescheduling interrupts IPI3: 0 0 Function call interrupts IPI4: 0 4 Single function call interrupts IPI5: 0 0 CPU stop interrupts IPI6: 25 46 IRQ work interrupts IPI7: 0 0 completion interrupts IPI8: 0 0 CPU backtrace Err: 0 cpu0 get 160-88=72 timer interrupts, CPU1 got 83-11=72 broadcast timer IPIs. So, overall system got 72+72=144 wake ups and 72 broadcast timer IPIs With the patch: CPU0 CPU1 27:107 37 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160: 66 7 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: 0 0 interrupt-controller 21 f7e84000.i2c 231: 0 0 interrupt-controller 20 f7e84800.i2c 265:311 0 interrupt-controller 8 serial IPI0: 0 0 CPU wakeup interrupts IPI1: 2 4 Timer broadcast interrupts IPI2: 58100 Rescheduling interrupts IPI3: 0 0 Function call interrupts IPI4: 0 4 Single function call interrupts IPI5: 0 0 CPU stop interrupts IPI6: 21 24 IRQ work interrupts IPI7: 0 0 completion interrupts IPI8: 0 0 CPU backtrace Err: 0 CPU0 CPU1 27:107 39 GIC 27 arch_timer 45: 62 0 GIC 45 mmc0 160: 69 75 interrupt-controller 8 timer 227: 0 0 interrupt-controller 4 f7e81400.i2c 228: 0 0 interrupt-controller 5 f7e81800.i2c 229: 0 0 interrupt-controller 7 dw_spi65535 230: