On Fri, Oct 16, 2020 at 4:09 PM Mel Gorman <mgor...@techsingularity.net> wrote: > > On Fri, Oct 16, 2020 at 03:41:12PM +0200, Rafael J. Wysocki wrote: > > > Turns out I didn't even have that. On another machine (same model, > > > same cpu, different BIOS that cannot be updated), enabling the C6 state > > > still did not enable it on boot and dmesg complained about CST not being > > > usable. This is weird because one would expect that if CST was unusable > > > that it would be the same as use_acpi == false. > > > > > > This could potentially be if the ACPI tables are unsuitable due to bad > > > bad FFH information for a lower c-state. If _CST is not found or usable, > > > should acpi_state_table.count be reset to go back to the old behaviour? > > > > Yes, it should, although I would reset it in intel_idle_cst_usable() > > right away before returning 'false'. > > > > Good stuff. > > > I can send a patch to do the above or please submit the one below as > > it works too. > > > > I'm happy to go with your alternative if you prefer. For a finish, > I decided it was worth reporting if the _CST was ignored regardless of > the reason. It performs roughly the same as setting use_acpi = false on > the affected machines. > > ---8<--- > intel_idle: Ignore _CST if control cannot be taken from the platform > > e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") avoids > enabling c-states that have been disabled by the platform with the > exception of C1E. > > Unfortunately, BIOS implementations are not always consistent in terms > of how capabilities are advertised and control cannot always be handed > over. If control cannot be handed over then intel_idle reports that "ACPI > _CST not found or not usable" but does not clear acpi_state_table.count > meaning the information is still partially used. > > This patch ignores ACPI information if CST control cannot be requested from > the platform. This was only observed on a number of Haswell platforms that > had identical CPUs but not identical BIOS versions. While this problem > may be rare overall, 24 separate test cases bisected to this specific > commit across 4 separate test machines and is worth addressing. If the > situation occurs, the kernel behaves as it did before commit e6d4f08a6776 > and uses any c-states that are discovered. > > The affected test cases were all ones that involved a small number of > processes -- exec microbenchmark, pipe microbenchmark, git test suite, > netperf, tbench with one client and system call microbenchmark. Each > case benefits from being able to use turboboost which is prevented if the > lower c-states are unavailable. This may mask real regressions specific > to older hardware so it is worth addressing. > > C-state status before and after the patch > > 5.9.0-vanilla POLL latency:0 disabled:0 default:enabled > 5.9.0-vanilla C1 latency:2 disabled:0 default:enabled > 5.9.0-vanilla C1E latency:10 disabled:0 default:enabled > 5.9.0-vanilla C3 latency:33 disabled:1 default:disabled > 5.9.0-vanilla C6 latency:133 disabled:1 default:disabled > 5.9.0-ignore-cst-v1r1 POLL latency:0 disabled:0 default:enabled > 5.9.0-ignore-cst-v1r1 C1 latency:2 disabled:0 default:enabled > 5.9.0-ignore-cst-v1r1 C1E latency:10 disabled:0 default:enabled > 5.9.0-ignore-cst-v1r1 C3 latency:33 disabled:0 default:enabled > 5.9.0-ignore-cst-v1r1 C6 latency:133 disabled:0 default:enabled > > Patch enables C3/C6. > > Netperf UDP_STREAM > > netperf-udp > 5.5.0 5.9.0 > vanilla ignore-cst-v1r1 > Hmean send-64 193.41 ( 0.00%) 226.54 * 17.13%* > Hmean send-128 392.16 ( 0.00%) 450.54 * 14.89%* > Hmean send-256 769.94 ( 0.00%) 881.85 * 14.53%* > Hmean send-1024 2994.21 ( 0.00%) 3468.95 * 15.85%* > Hmean send-2048 5725.60 ( 0.00%) 6628.99 * 15.78%* > Hmean send-3312 8468.36 ( 0.00%) 10288.02 * 21.49%* > Hmean send-4096 10135.46 ( 0.00%) 12387.57 * 22.22%* > Hmean send-8192 17142.07 ( 0.00%) 19748.11 * 15.20%* > Hmean send-16384 28539.71 ( 0.00%) 30084.45 * 5.41%* > > Fixes: e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") > Signed-off-by: Mel Gorman <mgor...@techsingularity.net> > --- > drivers/idle/intel_idle.c | 7 +++---- > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c > index 9a810e4a7946..4af2d3f2c8aa 100644 > --- a/drivers/idle/intel_idle.c > +++ b/drivers/idle/intel_idle.c > @@ -1212,14 +1212,13 @@ static bool __init intel_idle_acpi_cst_extract(void) > if (!intel_idle_cst_usable()) > continue; > > - if (!acpi_processor_claim_cst_control()) { > - acpi_state_table.count = 0; > - return false; > - } > + if (!acpi_processor_claim_cst_control()) > + break; > > return true; > } > > + acpi_state_table.count = 0; > pr_debug("ACPI _CST not found or not usable\n"); > return false; > }
Applied as a fix for 5.10-rc1, thanks!