Hi,

I'm still having fun with microbenchmarks. I disabled Power States (pstate) of my Intel CPU and forced the frequency to 3.4 GHz. I isolated 2 physical cores out of a total of 4. Timings are very stable *but* sometimes I get a large slowdown: like 60% or 80% slower, but only for a short time.
Do you know which CPU feature could explain such a temporary slowdown?

I tried the cpupower and powertop tools to learn more about the internal CPU states, but I don't see anything obvious. I also noticed that powertop has a major side effect: it changes the speed of my CPU cores! Since the CPU cores used to run benchmarks are isolated, powertop puts them at a low speed (like 1.6 GHz, half speed) while benchmarks are running, probably because the kernel doesn't "see" the benchmark processes.

My CPU model is: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz

I'm using the "userspace" scaling governor for the isolated CPU cores, but "ondemand" for the other CPU cores. I disabled pstate (kernel parameter: intel_pstate=disable), so the CPU scaling driver is "acpi-cpufreq".

CPUs 2,3,6,7 are isolated.

In the following examples, the same microbenchmark takes ~196 ms on all cores, except on core 3 in the first example (~357 ms).

Example 1:
---
$ for cpu in $(seq 0 7); do echo "=== CPU $cpu ==="; PYTHONHASHSEED=0 taskset -c $cpu ../fastcall/pgo/python performance/bm_call_simple.py -n 1 --timer perf_counter; done
=== CPU 0 ===
0.19619656700160704
=== CPU 1 ===
0.19547197800056892
=== CPU 2 ===
0.19512042699716403
=== CPU 3 ===
0.35738898099953076
=== CPU 4 ===
0.19744606299718725
=== CPU 5 ===
0.195480646998476
=== CPU 6 ===
0.19495172200186062
=== CPU 7 ===
0.19495161599843414
---

Example 2:
---
$ for cpu in $(seq 0 7); do echo "=== CPU $cpu ==="; PYTHONHASHSEED=0 taskset -c $cpu ../fastcall/pgo/python performance/bm_call_simple.py -n 1 --timer perf_counter; done
=== CPU 0 ===
0.19725238799946965
=== CPU 1 ===
0.19552089699936914
=== CPU 2 ===
0.19495758999983082
=== CPU 3 ===
0.19517506799820694
=== CPU 4 ===
0.1963375539999106
=== CPU 5 ===
0.19575440099652042
=== CPU 6 ===
0.19582506000006106
=== CPU 7 ===
0.19503543600148987
---

If I repeat the same test, timings are always ~196 ms on all cores.

It looks like some cores decide to sleep.
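Since powertop itself seems to disturb the core speed, one alternative might be to read the governor and current frequency straight from sysfs. The following is only a rough sketch, assuming the usual Linux cpufreq layout under /sys/devices/system/cpu/cpuN/cpufreq/ (scaling_governor, and scaling_cur_freq in kHz). The helper name read_cpufreq and its base parameter are made up for illustration, and the value reported by scaling_cur_freq may depend on the scaling driver.

```python
import os


def read_cpufreq(cpu, base="/sys/devices/system/cpu"):
    """Return (governor, frequency in MHz) for one CPU.

    Assumption: the standard cpufreq sysfs files exist for this CPU.
    Any file that cannot be read yields None instead of raising.
    """
    cpufreq_dir = os.path.join(base, "cpu%d" % cpu, "cpufreq")

    def read(name):
        try:
            with open(os.path.join(cpufreq_dir, name)) as fp:
                return fp.read().strip()
        except OSError:
            return None

    governor = read("scaling_governor")
    khz = read("scaling_cur_freq")  # sysfs reports kHz
    mhz = int(khz) / 1000 if khz is not None else None
    return governor, mhz


if __name__ == "__main__":
    # Print one line per logical CPU known to the OS.
    for cpu in range(os.cpu_count() or 1):
        governor, mhz = read_cpufreq(cpu)
        print("CPU %d: governor=%s freq=%s MHz" % (cpu, governor, mhz))
```

Running this in a loop next to the benchmark (on a non-isolated CPU) could show whether an isolated core reports a lower frequency during a slow run. It would not show idle states (C-states), which would need a separate check.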
Victor
_______________________________________________
Speed mailing list
https://mail.python.org/mailman/listinfo/speed
