Bear in mind that what you see by way of CPU speed is based on *sampling*,
and the CPU can switch speeds very quickly, far faster than you'd
necessarily see in your periodic updates.  Also note that if your cooling
isn't up to scratch for handling the CPU running permanently at its top
normal speed, thermal throttling will cause the system to slow down
independently of anything happening OS side.  That's embedded within the
chip and can't be disabled.
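
One way to check whether throttling kicked in during a run is to compare
the kernel's throttle counter before and after it. The following is only a
minimal sketch, assuming a Linux box with an Intel CPU that exposes
cpufreq/scaling_cur_freq and thermal_throttle/core_throttle_count under
/sys/devices/system/cpu/cpuN; the helper names are made up for
illustration, and missing files are simply reported as None.

```python
from pathlib import Path

# Assumed Linux sysfs location; not every kernel or CPU exposes these files.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_int(path):
    """Return the integer stored in a sysfs file, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def cpu_snapshot(cpu, base=SYSFS_CPU):
    """Collect the sampled frequency (kHz) and throttle count for one CPU."""
    cpu_dir = Path(base) / f"cpu{cpu}"
    return {
        "freq_khz": read_int(cpu_dir / "cpufreq" / "scaling_cur_freq"),
        "throttle_count": read_int(
            cpu_dir / "thermal_throttle" / "core_throttle_count"
        ),
    }


def throttled_between(before, after):
    """True if the throttle counter advanced, None if it could not be read."""
    if before["throttle_count"] is None or after["throttle_count"] is None:
        return None
    return after["throttle_count"] > before["throttle_count"]
```

Take cpu_snapshot(3) before and after a benchmark pinned to CPU 3 and pass
both to throttled_between(). The frequency field is still just a sample, so
an unchanged value doesn't prove the clock never moved in between.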

FWIW microbenchmarks are inherently unstable and susceptible to jitter on
the system side.  There are all sorts of things that could be interfering
outside the scope of your tests, and because the benchmark is over and done
with so quickly, if something does happen it's going to skew the entire
benchmark run.  If microbenchmarking really is the right thing for your
needs, you should look at running enough runs to get a fair idea of
realistic performance: think hundreds, then eliminate the particularly fast
and/or slow runs from consideration, and apply whatever other checks you
might consider for statistical significance.
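
To make the "drop the extremes" idea concrete, here is a minimal sketch in
Python. It assumes you already have a list of per-run timings in seconds;
the function names and the 10% trim fraction are arbitrary choices for
illustration, not a recommendation.

```python
import statistics


def trimmed(samples, fraction=0.1):
    """Drop the fastest and slowest `fraction` of the runs (from each end)."""
    ordered = sorted(samples)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k] if k else ordered


def summarize(samples, fraction=0.1):
    """Summarise the runs that survive trimming."""
    kept = trimmed(samples, fraction)
    return {
        "runs": len(samples),
        "kept": len(kept),
        "median": statistics.median(kept),
        "mean": statistics.mean(kept),
        # stdev needs at least two data points
        "stdev": statistics.stdev(kept) if len(kept) > 1 else 0.0,
    }
```

With a few hundred timings, summarize(timings) gives a median and spread
that a single 0.357 s outlier among ~0.196 s runs no longer dominates. Keep
the untrimmed data around too, since the outliers are often the interesting
part.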

I do have some concerns that you're increasingly creating a synthetic
environment to benchmark against, and that you're at risk of optimising
towards an environment the code won't actually run in, and might even end
up pursuing the wrong optimisations.

Paul

On Tue, May 17, 2016 at 11:21:29PM +0200, Victor Stinner wrote:
> According to a friend, my CPU model "Intel(R) Core(TM) i7-2600 CPU @
> 3.40GHz" has a "Turbo Mode" which is enabled by default. The CPU tries
> to use the Turbo Mode whenever possible, but disables it when the CPU is
> too hot. The change should be visible with the exact CPU frequency
> (the change can be a single MHz: 3400 => 3401). I didn't notice such
> minor CPU frequency change, but I didn't check carefully.
> 
> Anyway, I disabled the Turbo Mode and Hyperthreading in the EFI. It
> should avoid the strange performance "drop".
> 
> Victor
> 
> 2016-05-17 16:44 GMT+02:00 Victor Stinner <victor.stin...@gmail.com>:
> > Hi,
> >
> > I'm still having fun with microbenchmarks. I disabled Power States
> > (pstate) of my Intel CPU and forced the frequency to 3.4 GHz. I
> > isolated 2 physical cores out of a total of 4. Timings are very stable
> > *but* sometimes I get an impressive slowdown: like 60% or 80% slower,
> > but only for a short time.
> >
> > Do you know which CPU feature can explain such a temporary slowdown?
> >
> > I tried cpupower & powertop tools to try to learn more about internal
> > CPU states, but I don't see anything obvious. I also noticed that
> > powertop has a major side effect: it changes the speed of my CPU
> > cores! Since the CPU cores used to run benchmarks are isolated,
> > powertop uses a low speed (like 1.6 GHz, half speed) while benchmarks
> > are running, probably because the kernel doesn't "see" the benchmark
> > processes.
> >
> > My CPU model is: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
> >
> > I'm using "userspace" scaling governor for isolated CPU cores, but
> > "ondemand" for other CPU cores.
> >
> > I disabled pstate (kernel parameter: intel_pstate=disable), the CPU
> > scaling driver is "acpi-cpufreq".
> >
> > CPUs 2,3,6,7 are isolated.
> >
> > In the following examples, the same microbenchmark takes ~196 ms on
> > all cores, except for core 3 in the first example.
> >
> > Example 1:
> > ---
> > $ for cpu in $(seq 0 7); do echo "=== CPU $cpu ==="; PYTHONHASHSEED=0
> > taskset -c $cpu ../fastcall/pgo/python performance/bm_call_simple.py
> > -n 1 --timer perf_counter; done
> > === CPU 0 ===
> > 0.19619656700160704
> > === CPU 1 ===
> > 0.19547197800056892
> > === CPU 2 ===
> > 0.19512042699716403
> > === CPU 3 ===
> > 0.35738898099953076
> > === CPU 4 ===
> > 0.19744606299718725
> > === CPU 5 ===
> > 0.195480646998476
> > === CPU 6 ===
> > 0.19495172200186062
> > === CPU 7 ===
> > 0.19495161599843414
> > ---
> >
> > Example 2:
> > ---
> > $ for cpu in $(seq 0 7); do echo "=== CPU $cpu ==="; PYTHONHASHSEED=0
> > taskset -c $cpu ../fastcall/pgo/python performance/bm_call_simple.py
> > -n 1 --timer perf_counter; done
> > === CPU 0 ===
> > 0.19725238799946965
> > === CPU 1 ===
> > 0.19552089699936914
> > === CPU 2 ===
> > 0.19495758999983082
> > === CPU 3 ===
> > 0.19517506799820694
> > === CPU 4 ===
> > 0.1963375539999106
> > === CPU 5 ===
> > 0.19575440099652042
> > === CPU 6 ===
> > 0.19582506000006106
> > === CPU 7 ===
> > 0.19503543600148987
> > ---
> >
> > If I repeat the same test, timings are always ~196 ms on all cores.
> >
> > It looks like some cores decide to sleep.
> >
> > Victor
> _______________________________________________
> Speed mailing list
> Speed@python.org
> https://mail.python.org/mailman/listinfo/speed