Hi,
Solaris programs the HPET at boot time to use an I/O APIC
interrupt targeting CPU 0. When CPUs go into a deep C-state,
they schedule their next cbe interrupt on the HPET because
their local APIC timer stalls in deep C-states. Currently the HPET
always interrupts CPU 0, which in turn sends an IPI to the
CPU that needs to wake up next.
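To make the current flow concrete, here is a toy model in C. It is only a
sketch and not the actual Solaris code: every name in it (NCPU_MODEL,
hpet_proxy_t, proxy_isr and so on) is made up for illustration, and the
"IPI" is just a bit in a returned mask.

```c
/*
 * Toy model of the HPET acting as a proxy timer for deep-idle CPUs.
 * Hypothetical names throughout; this is not the Solaris implementation.
 */
#include <assert.h>
#include <stdint.h>

#define NCPU_MODEL   8
#define NO_DEADLINE  UINT64_MAX

typedef struct hpet_proxy {
    uint64_t deadline[NCPU_MODEL];  /* next wake-up time per deep-idle CPU */
    uint64_t comparator;            /* value the shared timer is armed with */
} hpet_proxy_t;

void
proxy_init(hpet_proxy_t *p)
{
    for (int i = 0; i < NCPU_MODEL; i++)
        p->deadline[i] = NO_DEADLINE;
    p->comparator = NO_DEADLINE;
}

/* Re-arm the single shared timer for the earliest pending deadline. */
void
proxy_rearm(hpet_proxy_t *p)
{
    uint64_t min = NO_DEADLINE;

    for (int i = 0; i < NCPU_MODEL; i++) {
        if (p->deadline[i] < min)
            min = p->deadline[i];
    }
    p->comparator = min;
}

/* Called by a CPU just before it enters a deep C-state. */
void
proxy_enter_deep_idle(hpet_proxy_t *p, int cpu, uint64_t wake_at)
{
    assert(cpu >= 0 && cpu < NCPU_MODEL);
    p->deadline[cpu] = wake_at;
    proxy_rearm(p);
}

/*
 * Models the ISR that runs on CPU 0 when the timer fires at time `now`.
 * Returns a bitmask of the CPUs that would be sent a wake-up IPI, then
 * re-arms the timer for whatever deadlines remain.
 */
uint32_t
proxy_isr(hpet_proxy_t *p, uint64_t now)
{
    uint32_t ipi_mask = 0;

    for (int i = 0; i < NCPU_MODEL; i++) {
        if (p->deadline[i] != NO_DEADLINE && p->deadline[i] <= now) {
            ipi_mask |= (uint32_t)1 << i;
            p->deadline[i] = NO_DEADLINE;
        }
    }
    proxy_rearm(p);
    return (ipi_mask);
}
```

In this model every expiry costs two wake-ups whenever the sleeping CPU is
not CPU 0: CPU 0 takes the HPET interrupt, then the target takes the IPI.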
I have not looked into re-programming the I/O APIC to
target different CPUs, such as the next CPU to wake up.
Here are some thoughts on this:
1. Always program the I/O APIC to target the HPET's interrupt
to the next CPU that should wake up.
Advantage: only one CPU ever has to wake up.
Disadvantage: Programming the I/O APIC whenever the
HPET's timer is programmed may be quite expensive.
For example, removing a few HPET reads from its ISR made
a huge performance difference.
2. Program the I/O APIC to target the HPET interrupt to a CPU
on the socket the power-aware scheduler will attempt to make
idle first.
Advantage: HPET is least likely to interrupt "busy" CPUs.
Advantage: The next deep C-state CPU to wake up is likely
to be on the same "package" aka "socket". Much of
the benefit of deep C-states is only realized when all CPUs
on a package enter deep C-states, so waking up 2 cores on
a socket is not much worse power-wise than waking just 1.
Advantage: minimizes I/O APIC re-programming.
3. Leave the I/O APIC targeting the HPET interrupt on CPU 0.
Advantage: ? No reliance on PAD for prototyping?
Advantage: least I/O APIC reprogramming in ISR.
Disadvantage: CPU 0 is usually already quite busy.
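The trade-off between the three options can be sketched as a small C model
that counts how often the I/O APIC entry would be rewritten and whether a
forwarding IPI is still needed. This is only an illustration under
assumptions: the names (hpet_route_t, route_update, ...) are hypothetical,
the "reprogramming" is just a counter, and the CPU on the first-to-idle
socket is assumed to be handed in by the caller rather than obtained from
the power-aware scheduler.

```c
/*
 * Toy comparison of the three HPET interrupt routing options.
 * Hypothetical names; no real I/O APIC access is performed.
 */
#include <assert.h>

typedef enum {
    TARGET_NEXT_WAKER,   /* option 1: follow the next CPU to wake up */
    TARGET_IDLE_SOCKET,  /* option 2: a CPU on the first-to-idle socket */
    TARGET_CPU0          /* option 3: always CPU 0 */
} hpet_target_policy_t;

typedef struct hpet_route {
    int      target_cpu;  /* CPU the modelled I/O APIC entry points at */
    unsigned reprograms;  /* how many times the entry was rewritten */
} hpet_route_t;

/* Rewrite the modelled I/O APIC entry only when the target changes. */
void
route_set(hpet_route_t *r, int cpu)
{
    assert(cpu >= 0);
    if (r->target_cpu != cpu) {
        r->target_cpu = cpu;
        r->reprograms++;
    }
}

/*
 * Called each time the HPET timer is programmed.
 * next_waker:      CPU with the earliest pending deadline.
 * idle_socket_cpu: assumed input, a CPU on the socket that would be
 *                  made idle first.
 * Returns the CPU the interrupt is routed to afterwards.
 */
int
route_update(hpet_route_t *r, hpet_target_policy_t policy,
    int next_waker, int idle_socket_cpu)
{
    switch (policy) {
    case TARGET_NEXT_WAKER:
        route_set(r, next_waker);
        break;
    case TARGET_IDLE_SOCKET:
        route_set(r, idle_socket_cpu);
        break;
    case TARGET_CPU0:
        route_set(r, 0);
        break;
    }
    return (r->target_cpu);
}

/* Nonzero if the interrupted CPU must still forward an IPI. */
int
route_needs_ipi(const hpet_route_t *r, int next_waker)
{
    return (r->target_cpu != next_waker);
}
```

For a made-up sequence of wake-ups on CPUs 4, 6, 5, 7 (all on one socket,
with CPU 4 as the option #2 target and the route starting on CPU 0), the
model rewrites the entry four times for option #1, once for option #2 and
never for option #3. It says nothing about how expensive each rewrite is on
real hardware, which is the open question.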
Option #2 sounds interesting to me. Any thoughts?
I have not yet experimented with I/O APIC re-programming.
I am guessing #1 will be too expensive for the interrupt service
routine.
Regards,
Bill