Bear in mind that what you see by way of CPU speed is based on *sampling*, and the CPU can switch speeds very quickly, far faster than you'd necessarily see in your periodic updates. Also note that if your cooling isn't up to scratch for handling the CPU running permanently at its top normal speed, thermal throttling will cause the system to slow down independently of anything happening OS side. That's embedded within the chip and can't be disabled.
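
To make the sampling point concrete, here is a rough sketch of what a periodic frequency readout amounts to. It assumes a Linux box exposing the cpufreq sysfs file `scaling_cur_freq` (values in kHz); the helper names `read_cur_freq_khz` and `sample_freq` are made up for illustration and aren't from any tool mentioned in this thread.

```python
import time
from pathlib import Path


def read_cur_freq_khz(cpu):
    """Read the kernel's current frequency figure for one CPU, in kHz.

    Assumption: Linux with the cpufreq sysfs interface available.
    """
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq")
    return int(path.read_text().strip())


def sample_freq(cpu, samples=10, interval=0.1, reader=read_cur_freq_khz):
    """Poll the frequency `samples` times, sleeping `interval` seconds after each read.

    Whatever the CPU does between two polls is invisible to this loop,
    which is the limitation of any sampled view of CPU speed.
    """
    observed = []
    for _ in range(samples):
        observed.append(reader(cpu))
        time.sleep(interval)
    return observed
```

Calling `sample_freq(3)` gives ten readings for CPU 3 at roughly 100 ms spacing; a frequency change lasting a few milliseconds between two reads simply never shows up in the list, however stable the output looks.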
FWIW, microbenchmarks are inherently unstable and susceptible to jitter on the system side. There are all sorts of things outside the scope of your tests that could be interfering, and because the benchmark is over and done with so quickly, if something does happen it's going to skew the entire benchmark run.

If microbenchmarking really is the right thing for your needs, you should look at doing enough runs to get a fair idea of realistic performance. Think hundreds of runs, then eliminate the particularly fast and/or slow ones from consideration, and apply whatever other checks you consider appropriate for statistical significance.

I do have some concerns that you're increasingly creating a synthetic environment to benchmark against, and that you're at risk of optimising towards an environment the code won't actually run in, and might even end up pursuing the wrong optimisations.

Paul

On Tue, May 17, 2016 at 11:21:29PM +0200, Victor Stinner wrote:
> According to a friend, my CPU model "Intel(R) Core(TM) i7-2600 CPU @
> 3.40GHz" has a "Turbo Mode" which is enabled by default. The CPU tries
> to use the Turbo Mode whenever possible, but disables it when the CPU is
> too hot. The change should be visible with the exact CPU frequency
> (the change can be a single MHz: 3400 => 3401). I didn't notice such
> minor CPU frequency change, but I didn't check carefully.
>
> Anyway, I disabled the Turbo Mode and Hyperthreading in the EFI. It
> should avoid the strange performance "drop".
>
> Victor
>
> 2016-05-17 16:44 GMT+02:00 Victor Stinner <victor.stin...@gmail.com>:
> > Hi,
> >
> > I'm still having fun with microbenchmarks. I disabled Power States
> > (pstate) of my Intel CPU and forced the frequency for 3.4 GHz. I
> > isolated 2 physical cores on a total of 4. Timings are very stable
> > *but* sometimes, I get impressive slowdown: like 60% or 80% slower,
> > but only for a short time.
> >
> > Do you know which CPU feature can explain such temporary slowdown?
> >
> > I tried cpupower & powertop tools to try to learn more about internal
> > CPU states, but I don't see anything obvious. I also noticed that
> > powertop has a major side effect: it changes the speed of my CPU
> > cores! Since the CPU cores used to run benchmarks are isolated,
> > powertop uses a low speed (like 1.6 GHz, half speed) while benchmarks
> > are running, probably because the kernel doesn't "see" the benchmark
> > processes.
> >
> > My CPU model is: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> >
> > I'm using "userspace" scaling governor for isolated CPU cores, but
> > "ondemand" for other CPU cores.
> >
> > I disabled pstate (kernel parameter: intel_pstate=disable), the CPU
> > scaling driver is "acpi-cpufreq".
> >
> > CPUs 2,3,6,7 are isolated.
> >
> > In the following examples, the same microbenchmark takes ~196 ms on
> > all cores, except of the core 3 on the first example.
> >
> > Example 1:
> > ---
> > $ for cpu in $(seq 0 7); do echo "=== CPU $cpu ==="; PYTHONHASHSEED=0
> > taskset -c $cpu ../fastcall/pgo/python performance/bm_call_simple.py
> > -n 1 --timer perf_counter; done
> > === CPU 0 ===
> > 0.19619656700160704
> > === CPU 1 ===
> > 0.19547197800056892
> > === CPU 2 ===
> > 0.19512042699716403
> > === CPU 3 ===
> > 0.35738898099953076
> > === CPU 4 ===
> > 0.19744606299718725
> > === CPU 5 ===
> > 0.195480646998476
> > === CPU 6 ===
> > 0.19495172200186062
> > === CPU 7 ===
> > 0.19495161599843414
> > ---
> >
> > Example 2:
> > ---
> > $ for cpu in $(seq 0 7); do echo "=== CPU $cpu ==="; PYTHONHASHSEED=0
> > taskset -c $cpu ../fastcall/pgo/python performance/bm_call_simple.py
> > -n 1 --timer perf_counter; done
> > === CPU 0 ===
> > 0.19725238799946965
> > === CPU 1 ===
> > 0.19552089699936914
> > === CPU 2 ===
> > 0.19495758999983082
> > === CPU 3 ===
> > 0.19517506799820694
> > === CPU 4 ===
> > 0.1963375539999106
> > === CPU 5 ===
> > 0.19575440099652042
> > === CPU 6 ===
> > 0.19582506000006106
> > === CPU 7 ===
> > 0.19503543600148987
> > ---
> >
> > If I repeat the same test, timings are always ~196 ms on all cores.
> >
> > It looks like some cores decide to sleep.
> >
> > Victor
> _______________________________________________
> Speed mailing list
> Speed@python.org
> https://mail.python.org/mailman/listinfo/speed