The sched_mc feature was originally designed to improve the power
consumption of multi-package systems. Several architecture functions
are available to tune the topology and the scheduler's parameters when
the scheduler rebuilds the sched_domain hierarchy (i.e. when the
sched_mc_power_savings level changes).

This patch series is an attempt to improve the power consumption of
dual and quad Cortex-A9 systems when sched_mc_power_savings is set to
2. The policy of the following patches is to accept up to 4 threads
(configurable) in the run queue of a core before starting to load
balance when the cpu runs at low frequencies, but to accept only 1
thread at high frequencies, which is the normal behaviour. The goal is
to use only one cpu in light load situations and both cpus in heavy
load situations.
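The policy described above can be sketched as a small decision
function. This is only an illustrative model in Python, not code from
the patches: the function names and the frequency threshold parameter
are hypothetical, and only the limits of 4 threads and 1 thread come
from the description.

```python
# Illustrative model only. Names and the threshold parameter are
# assumptions made for this sketch, not taken from the patches.
LOW_FREQ_MAX_THREADS = 4   # configurable limit at low frequencies
HIGH_FREQ_MAX_THREADS = 1  # normal scheduler behaviour


def max_threads_before_balance(cur_freq_khz, low_freq_threshold_khz):
    """Return how many runnable threads a core accepts before balancing."""
    if cur_freq_khz <= low_freq_threshold_khz:
        return LOW_FREQ_MAX_THREADS
    return HIGH_FREQ_MAX_THREADS


def should_load_balance(nr_running, cur_freq_khz, low_freq_threshold_khz):
    """True when the core holds more threads than the policy accepts."""
    limit = max_threads_before_balance(cur_freq_khz, low_freq_threshold_khz)
    return nr_running > limit
```

With a hypothetical threshold of 400000 kHz, a core running 3 threads
at 200000 kHz keeps them, whereas the same 3 threads at 1000000 kHz
are spread over the other cpu.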
Patches [1-3] modify the ARM cpu topology according to the
sched_mc_power_savings value and the Cortex id.
Patch [4] enables the ARCH_POWER feature of the scheduler.
Patch [5] adds the ARCH_POWER function for the ARM platform.
Patches [6-7] modify the cpu_power of the CA-9 according to the
sched_mc_power_savings level and the current frequency. The main goal
is to increase the capacity of a core when it runs at a low cpu
frequency in order to pull tasks onto this core. Note that this
behaviour is not really advised, but it can be seen as an intermediate
step between the use of cpu hotplug (which is not a power saving
feature) and a new load balancer which will take into account low load
situations on dual core.
Patch [8] ensures that cpu0 is used in priority when only one CPU is running.
Patch [9] adds a debugfs interface for test purposes.
Patch [10] ensures that the cpu_power is updated periodically.
Patch [11] fixes an issue with the triggering of the ilb.
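As a rough illustration of why a larger cpu_power pulls tasks onto a
core, the sketch below assumes that the load balancer derives the
capacity of a group by dividing its cpu_power by SCHED_POWER_SCALE
(1024) and rounding to the closest integer. The scaled_cpu_power()
helper and its parameters are hypothetical and do not reproduce the
code of patches [6-7].

```python
SCHED_POWER_SCALE = 1024  # cpu_power of one cpu at nominal capacity


def scaled_cpu_power(cur_freq_khz, low_freq_threshold_khz,
                     low_freq_capacity=4):
    """Hypothetical arch hook: report a larger cpu_power at low frequency."""
    if cur_freq_khz <= low_freq_threshold_khz:
        return low_freq_capacity * SCHED_POWER_SCALE
    return SCHED_POWER_SCALE


def group_capacity(cpu_power):
    """Round-to-closest division of cpu_power by SCHED_POWER_SCALE."""
    return (cpu_power + SCHED_POWER_SCALE // 2) // SCHED_POWER_SCALE
```

Under these assumptions a core at low frequency reports a capacity of
4 tasks instead of 1, so the balancer does not consider it overloaded
until a fifth task becomes runnable.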
TODO list:
- remove the useless start of the ilb when the core has capacity
- add a method (DT, sysfs, ...) to set the threshold for using 1 or 2
  cpus on dual CA-9
- irq balancing
The tests hereafter have been done on a u8500 with kernel linaro-3.1.
They check that there is no obvious loss of performance when
sched_mc=2.
sysbench --test=cpu --num-threads=12 --max-time=20 run
Test execution summary:    sched_mc=0    sched_mc=2    cpu hotplug
total number of events:    665           664           336
per-request statistics:
min:                       92.68ms       70.53ms       618.89ms
avg:                       361.75ms      361.38ms      725.29ms
max:                       421.08ms      420.73ms      840.74ms
approx. 95 percentile:     402.28ms      390.53ms      760.17ms
sysbench --test=threads --thread-locks=9 --num-threads=12 --max-time=20 run
Test execution summary:    sched_mc=0    sched_mc=2    cpu hotplug
total number of events:    10000         10000         3129
per-request statistics:
min:                       1.62ms        1.70ms        13.16ms
avg:                       22.23ms       21.87ms       76.77ms
max:                       153.52ms      133.99ms      173.82ms
approx. 95 percentile:     54.12ms       52.65ms       136.32ms
sysbench --test=threads --thread-locks=2 --num-threads=3 --max-time=20 run
Test execution summary:    sched_mc=0    sched_mc=2    cpu hotplug
total number of events:    10000         10000         10000
per-request statistics:
min:                       1.38ms        1.38ms        5.70ms
avg:                       4.67ms        5.37ms        11.85ms
max:                       36.84ms       32.42ms       32.58ms
approx. 95 percentile:     14.34ms       12.89ms       21.30ms
cyclictest -q -t -D 20
Only one cpu is used during this test when sched_mc=2, whereas both
cpus are used when sched_mc=0.
Test execution summary:    sched_mc=0    sched_mc=2    cpu hotplug
Avg, Max:                  15, 434       19, 2145      17, 3556
Avg, Max:                  14, 104       19, 1686      17, 3593
Regards,
Vincent
_______________________________________________
linaro-dev mailing list
http://lists.linaro.org/mailman/listinfo/linaro-dev