On Thu, 2013-06-20 at 14:46 -0500, Dave Chiluk wrote:
> Running the below testcase shows each process consuming 41-43% of its
> respective CPU, while per-core idle numbers show 63-65%, a disparity
> of roughly 4-8%. Is this a bug, known behaviour, or a consequence of
> the process being IO bound?
All three, I suppose. Idle is indeed inflated when softirq load is
present, and the exact numbers you see depend on your ACCOUNTING
config. There are lies, there are damn lies... and there are
statistics.

> 1. run sudo taskset -c 0 netserver
> 2. run taskset -c 1 netperf -H localhost -l 3600 -t TCP_RR & (start
>    netperf with priority on cpu1)
> 3. run top, press 1 for multiple CPUs to be separated

CONFIG_TICK_CPU_ACCOUNTING, cpu[23] isolated:

cgexec -g cpuset:rtcpus netperf.sh 999 & sleep 300 && killall -9 top

%Cpu2 :  6.8 us, 42.0 sy,  0.0 ni, 42.0 id,  0.0 wa,  0.0 hi,  9.1 si,  0.0 st
%Cpu3 :  5.6 us, 43.3 sy,  0.0 ni, 40.0 id,  0.0 wa,  0.0 hi, 11.1 si,  0.0 st
                                   ^^^^
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+  P COMMAND
 7226 root      20   0  8828   336  192 S  57.6  0.0   2:49.40 3 netserver  100*(2*60+49.4)/300  = 56.4
 7225 root      20   0  8824   648  504 R  55.6  0.0   2:46.55 2 netperf    100*(2*60+46.55)/300 = 55.5

OK, accumulated time ~agrees with the %CPU snapshots.

cgexec -g cpuset:rtcpus taskset -c 3 schedctl -I pert 5

(pert is a self-calibrating tsc tight-loop perturbation measurement
proggy; it enters the kernel once per 5s period for a write. It doesn't
care about post-period stats processing/output time, but it's running
SCHED_IDLE, so it gets VERY little CPU when competing, and runs more or
less only when netserver is idle. Plenty good enough as a proxy for
idle.)

...
cgexec -g cpuset:rtcpus netperf.sh 9999
...
pert/s:    81249 >17.94us:    24 min:  0.08 max: 33.89 avg:  8.24 sum/s: 669515us overhead: 66.95%
pert/s:    81151 >18.43us:    25 min:  0.14 max: 37.53 avg:  8.25 sum/s: 669505us overhead: 66.95%
                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pert's userspace tsc loop gets ~32% ~= the idle upper bound; top
reports idle = ~40%, a disparity of ~8%.

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+  P COMMAND
23067 root      20   0  8828   340  196 R  57.5  0.0   0:19.15 3 netserver
23040 root      20   0  8208   396  304 R  42.7  0.0   0:35.61 3 pert
                                           ^^^^
~10% disparity.
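FWIW, the TIME+ sanity check above is easy enough to script; a
throwaway sketch (the helper name and the 300s window are mine, TIME+
values are from the run above):

```shell
#!/bin/sh
# Convert a top TIME+ value (min:sec.hundredths) into average %CPU over
# a wall-clock window, i.e. 100 * (min*60 + sec) / window. Hypothetical
# helper, just automating the arithmetic done by hand above.
timeplus_to_pct() {
    mins=${1%%:*}               # minutes before the colon
    secs=${1#*:}                # seconds (with hundredths) after it
    awk -v m="$mins" -v s="$secs" -v w="$2" \
        'BEGIN { printf "%.1f\n", 100 * (m * 60 + s) / w }'
}

timeplus_to_pct 2:49.40 300     # netserver over the 300s run
timeplus_to_pct 2:46.55 300     # netperf over the 300s run
```

(%.1f rounds 56.466 to 56.5 where the by-hand figure above truncated
to 56.4; same number within rounding.)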
perf record -e irq:softirq* -a -C 3 -- sleep 00
perf report --sort=comm

99.80%  netserver
 0.20%  pert

pert does ~zip softirq processing (timer+rcu), and ~zip squat in the
kernel generally.

Repeat.

cgexec -g cpuset:rtcpus netperf.sh 3600

pert/s:    80860 >474.34us:     0 min:  0.06 max: 35.26 avg:  8.28 sum/s: 669197us overhead: 66.92%
pert/s:    80897 >429.20us:     0 min:  0.14 max: 37.61 avg:  8.27 sum/s: 668673us overhead: 66.87%
pert/s:    80800 >388.26us:     0 min:  0.14 max: 31.33 avg:  8.26 sum/s: 667277us overhead: 66.73%

%Cpu3 : 36.3 us, 51.5 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi, 12.1 si,  0.0 st
                                   ^^^^ ~agrees with pert

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+  P COMMAND
23569 root      20   0  8828   340  196 R  57.2  0.0   0:21.97 3 netserver
23040 root      20   0  8208   396  304 R  42.9  0.0   6:46.20 3 pert
                                                       ^^^^ pert is VERY nearly 100% userspace

One of those numbers is a... statistic.

Kill pert...

%Cpu3 :  3.4 us, 42.5 sy,  0.0 ni, 41.4 id,  0.1 wa,  0.0 hi, 12.5 si,  0.0 st
         ^^^ ~agrees that pert's us claim went away, but wth is up with
sy? It dropped ~9% after killing a ~100% us proggy. nak

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+  P COMMAND
23569 root      20   0  8828   340  196 R  56.6  0.0   2:50.80 3 netserver

Yup, adding softirq load turns utilization numbers into... statistics.
Pure CPU load idle numbers look fine.

	-Mike
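P.S. If you want to watch the skew without trusting top's rounding, the
/proc/stat deltas can be computed directly. A sketch; stat_delta is a
made-up helper name, and the field order assumed is Linux's /proc/stat
layout (user nice system idle iowait irq softirq steal):

```shell
#!/bin/sh
# Compute per-field CPU percentages from two samples of one /proc/stat
# cpu line -- roughly the arithmetic top does for its %Cpu row.
stat_delta() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        NR == 1 { for (i = 2; i <= NF; i++) a[i] = $i }
        NR == 2 {
            tot = 0
            for (i = 2; i <= NF; i++) { v[i] = $i - a[i]; tot += v[i] }
            # $2=us $4=sy $5=id $8=si
            printf "us %.1f sy %.1f id %.1f si %.1f\n",
                100 * v[2] / tot, 100 * v[4] / tot,
                100 * v[5] / tot, 100 * v[8] / tot
        }'
}

# Live usage against cpu3 as in the runs above:
#   s1=$(grep '^cpu3 ' /proc/stat); sleep 5; s2=$(grep '^cpu3 ' /proc/stat)
#   stat_delta "$s1" "$s2"

# Synthetic samples, deltas us=100 sy=200 id=100 si=100 of 500 ticks:
stat_delta "cpu3 100 0 100 100 0 0 100 0" \
           "cpu3 200 0 300 200 0 0 200 0"
# -> us 20.0 sy 40.0 id 20.0 si 20.0
```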