First of all, I'd like to introduce the major power management features
currently available on DragonFly.
- ACPI P-state.  It has the proper CPU power domain support.

- ACPI C-state.  Unlike other BSDs, on relatively recent Intel CPUs (the
  oldest Intel CPU I tested is Sandy Bridge), we don't use an I/O port to
  enter ACPI C2/C3; instead, we give the BIOS a hint that 'native C2/C3'
  is supported.  The BIOS then sends us the ACPI C2/C3 to mwait C-state
  mappings through GAS (the ACPI Generic Address Structure).  The GAS
  also contains information about whether checking bus master status is
  needed for ACPI C3.  Given that most recent Intel CPUs (since Core 2)
  do not require bus master arbitration or a cache flush before entering
  ACPI C3, entering an ACPI C-state is just a matter of executing
  monitor/mwait instructions.

- Intel Performance and Energy Bias Hint.  According to the Intel
  software developer manual, it's a "hint to guide the hardware heuristic
  of power management features to favor increasing dynamic performance or
  conserve energy consumption".

- Mwait C-state.  This requires the mwait extension, which is available
  on almost all recent CPUs.  If we need to check bus master status,
  flush the cache or do bus master arbitration before entering ACPI C3,
  then mwait C-states deeper than C2/0 will not be used.

========================
ACPI P-state

On Intel i7-3770 3.40GHz, Intel i5-3230M 2.60GHz and many other recent
Intel CPUs (tested by dillon@), adjusting the ACPI P-state does _not_
reduce power consumption at all.  On these CPUs, adjusting the ACPI
P-state only affects how dynamic frequency works, e.g.
on Intel i5-3230M (system is idle):

ACPI P-state 2601 (TurboBoost):
hw.sensors.cpu0.freq0: 2721627000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 2751159000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 2627060000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 2191103000 Hz (cpu3 freq)

ACPI P-state 2600:
hw.sensors.cpu0.freq0: 2266678000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 2401977000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 2248562000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 2406333000 Hz (cpu3 freq)

ACPI P-state 1200:
hw.sensors.cpu0.freq0: 1197281000 Hz (cpu0 freq)
hw.sensors.cpu1.freq0: 1197340000 Hz (cpu1 freq)
hw.sensors.cpu2.freq0: 1197284000 Hz (cpu2 freq)
hw.sensors.cpu3.freq0: 1197300000 Hz (cpu3 freq)

On Intel E5-2620 v2 2.10GHz, adjusting the ACPI P-state reduces power
consumption.  However, as far as I tested, power consumption only changes
between the TurboBoost ACPI P-state and the non-TurboBoost ACPI P-states,
e.g. on Intel E5-2620 v2 (2-way):

ACPI P-state 2101 (TurboBoost): 94.6w
ACPI P-state 2100: 92.5w
ACPI P-state 1200: 92.5w

========================
Intel Performance and Energy Bias Hint

To be frank, I didn't notice any power consumption or thermal changes
from adjusting this hint on any of the Intel CPUs that I tested.

========================
ACPI C-state

Since on all of the Intel CPUs I tested ACPI C-states are mapped to mwait
C-states and do not require bus master operations, we move on to mwait
C-states.

========================
Mwait C-state

It seems to be the only power management feature that reduces power
consumption on all Intel CPUs dillon@ and I tested.  The power
consumption on the CPUs I tested:
Intel i7-3770 (ACPI P-state 3401):
mwait C1/0: 38.3w
mwait C1/1: 38.3w
mwait C2/0: 37.6w
mwait C3/0: 36.8w

Intel i5-3230M (ACPI P-state 2601):
mwait C1/0: 14.7w
mwait C1/1: 14.7w
mwait C2/0: 13.3w
mwait C3/0: 12.9w
mwait C4/0: 12.9w
mwait C4/1: 12.9w

Intel E5-2620 v2 (2-way)
(ACPI P-state 2101 TurboBoost):
mwait C1/0: 94.6w
mwait C1/1: 94.6w
mwait C2/0: 93.7w
mwait C3/0: 92.3w
(ACPI P-state 2100~1200):
mwait C1/0: 92.5w
mwait C1/1: 92.5w
mwait C2/0: 85.3w
mwait C3/0: 83.8w

One thing in common is that there is no power consumption difference
between mwait C1/0 (C1) and C1/1 (C1E?), probably because C1E is entered
once all cores are in the C1 state, as mentioned in the Intel E5-2600 v2
datasheet.

Though deep mwait C-states reduce power consumption, you have to pay for
the additional latency.  The latency could be as high as 40us if the CPU
enters a deep package C-state.  The average latency I gathered on the
various CPUs I tested (using debug.ipiq.latency_test):

Intel i7-3770 (ACPI P-state 3401):
mwait C1/0 and C1/1: 760ns
mwait C2/0: 18us
mwait C3/0: 25us

Intel i5-3230M (ACPI P-state 2601):
mwait C1/0 and C1/1: 950ns
mwait C2/0: 21us
mwait C3/0, C4/0, C4/1: 26us

Intel E5-2620 v2 (2-way)
(ACPI P-state 2101 TurboBoost):
mwait C1/0 and C1/1: same package 2200ns, different package 2600ns
mwait C2/0: same package 15us, different package 24us
mwait C3/0: same package 33us, different package 37us
(ACPI P-state 2100):
mwait C1/0 and C1/1: same package 2200ns, different package 2600ns
mwait C2/0: same package 15us, different package 22us
mwait C3/0: same package 26us, different package 36us

NOTE:
For Intel E5-2620 v2 (2-way) in TurboBoost mode, there is up to 60us
latency in the same-package mwait C2 and mwait C3 latency tests (well, I
don't know why).

If your application is latency sensitive, you need to be careful with the
deep mwait C-states.

--------

There are cases where you can save more power and your system runs
faster!
The situation I found is that deep mwait C-states can allow a loaded CPU
to boost to a higher frequency.  Here is what I saw on Intel E5-2620 v2
(2-way) (ACPI P-state 2101 TurboBoost):

make -j 48 -DNO_MODULES buildkernel KERNCONF=LINT64

Force mwait C1/0 on all CPUs.
Total time: 182s
Power consumption during make depend: 110w
CPU frequency during make depend: 2.37GHz
Power consumption during full run: 161w
CPU frequency during full run: 2.4GHz

Force mwait C3/0 on all CPUs.
Total time: 180s (2 seconds shorter!)
Power consumption during make depend: 106w (4w lower!)
CPU frequency during make depend: 2.57GHz (200MHz higher!)
Power consumption during full run: 161w (same)
CPU frequency during full run: 2.4GHz

Best Regards,
sephe

--
Tomorrow Will Never Die
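
A rough sketch for anyone who wants to redo the arithmetic on the
buildkernel comparison above.  It is not part of the original
measurements.  The dictionary keys and the helper names (energy_joules,
compare) are made up for illustration, and the energy estimate rests on
a loud assumption: that the "full run" wattage is the average draw over
the whole build, which the figures above do not actually state.

```python
# Figures copied from the buildkernel comparison above
# (Intel E5-2620 v2, 2-way, ACPI P-state 2101 TurboBoost).
runs = {
    "mwait C1/0": {"seconds": 182, "depend_w": 110, "depend_ghz": 2.37, "full_w": 161},
    "mwait C3/0": {"seconds": 180, "depend_w": 106, "depend_ghz": 2.57, "full_w": 161},
}


def energy_joules(run):
    # ASSUMPTION: "full run" power is treated as the average draw over the
    # whole build, so energy = watts * seconds. The post does not say this.
    return run["full_w"] * run["seconds"]


def compare(base, other):
    # Positive numbers mean "other" did better than "base".
    return {
        "seconds_saved": base["seconds"] - other["seconds"],
        "depend_watts_saved": base["depend_w"] - other["depend_w"],
        "depend_mhz_gained": round((other["depend_ghz"] - base["depend_ghz"]) * 1000),
        "joules_saved": energy_joules(base) - energy_joules(other),
    }


delta = compare(runs["mwait C1/0"], runs["mwait C3/0"])
for key, value in delta.items():
    print(f"{key}: {value}")
```

Under that assumption the whole-build energy difference comes only from
the shorter run time, since the full-run wattage is the same in both
cases; the 4w difference shows up only in the make depend phase.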