On Tue, 25 Mar 2014 03:23:54 +0530 Rajesh Mallah <mallah.raj...@gmail.com> wrote:
> I also observed that a clone of the rootfs from Mele M3 to another
> A20 based TB Box consistently performed slower than Mele M3.
>
> MeleM3 : 18.95 secs
> Other A20: 25 secs
>
> the dump from a10-meminfo-static were same in both the cases
> except for dram_zq param
>
> Can anyone pls explain why the difference in the A20 based boards
> itself ?

To profile this use case, we can run the following command:

    $ DISPLAY=:0 perf record -e cpu-clock -a gtkperf -a

This instructs perf to collect statistics for the whole system from
all CPU cores while gtkperf is running. Now that we have all the
statistics collected, we can check the percentage of CPU usage for
different processes:

    $ perf report -s pid
    49.02%  gtkperf:19651
    30.54%  Xorg:19603
    18.75%  swapper: 0
     0.69%  kworker/0:1:19569
     0.32%  xkbcomp:19656
     0.31%  xkbcomp:19655
     0.12%  perf:19650

This means that some of the time the CPU cores were idle (swapper).
gtkperf uses about 1.6 times as much CPU time as Xorg. You can also
run 'perf report' to see the time spent in each individual function
(if you have debugging symbols).

Now there is indeed one strange thing. If I run 'htop' while gtkperf
is running, I can sometimes see that only one CPU core is fully loaded
while the other is completely idle. And both the gtkperf and Xorg
processes are running on the same fully loaded CPU core.

As an experiment (on an Allwinner A20 based Cubietruck board), we can
try pinning the gtkperf and Xorg processes to CPU cores. Start Xorg
and pin it to CPU core 0:

    # taskset -c 0 Xorg

Then run gtkperf pinned to the same CPU core 0 as Xorg:

    $ DISPLAY=:0 taskset -c 0 gtkperf -a
    Total time: 26.78

And also pinned to a different core, CPU 1, for comparison:

    $ DISPLAY=:0 taskset -c 1 gtkperf -a
    Total time: 19.44

When Xorg and gtkperf are running on different CPU cores, the
performance is better. Without using taskset to pin processes to CPU
cores, the gtkperf result is somewhere between these 19.44 and 26.78
times, typically closer to the latter.
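To put a number on the gap between the two pinned runs, here is a
minimal sketch. The helper name slowdown_pct is made up for this
example (it is not part of gtkperf, perf or taskset) and it only does
arithmetic on the two Cubietruck timings above:

```shell
#!/bin/sh
# slowdown_pct SAME_CORE_SECS OTHER_CORE_SECS
# Hypothetical helper for illustration only: prints how much longer,
# in percent, the same-core run took than the different-core run.
slowdown_pct() {
    awk -v same="$1" -v other="$2" \
        'BEGIN { printf "%.1f\n", (same - other) / other * 100 }'
}

# Cubietruck timings from the taskset experiment
slowdown_pct 26.78 19.44
```

For these timings it reports that sharing one core with Xorg costs
gtkperf roughly 38% extra wall-clock time.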
It basically looks like the CFS scheduler in the linux-sunxi 3.4.79
kernel is not doing a stellar job for gtkperf. However, similar
gtkperf behaviour can also be observed on an ARM Chromebook (dual-core
Cortex-A15 1.7GHz) when using exactly the same rootfs:

    Total time: 9.35 (just run gtkperf without any tweaks)
    Total time: 9.82 (Xorg and gtkperf pinned to the same CPU core)
    Total time: 7.11 (Xorg and gtkperf pinned to different CPU cores)

-- 
Best regards,
Siarhei Siamashka