By default, coccicheck utilizes all available threads to implement
parallelisation. However, when all available threads are used,
a decrease in performance is noted. The elapsed time is  minimum
when at most one thread per core is used.

For example, on benchmarking the semantic patch kfree.cocci for
usb/serial using hyperfine, the outputs obtained for J=5 and J=2
are 1.32 and 1.90 times faster than those for J=10 and J=9
respectively for two separate runs. For the larger drivers/staging
directory, minimium elapsed time is obtained for J=3 which is 1.86
times faster than that for J=12. The optimal J value does not
exceed 6 in any of the test runs. The benchmarks are run on a machine
with 6 cores, with 2 threads per core, i.e, 12 hyperthreads in all.

To improve performance, modify coccicheck to use at most only
one thread per core by default.

Signed-off-by: Sumera Priyadarsini <[email protected]>
---
 scripts/coccicheck | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/scripts/coccicheck b/scripts/coccicheck
index e04d328210ac..dd228dcc915e 100755
--- a/scripts/coccicheck
+++ b/scripts/coccicheck
@@ -75,8 +75,17 @@ else
         OPTIONS="--dir $KBUILD_EXTMOD $COCCIINCLUDE"
     fi
 
+    # Use only one thread per core by default if hyperthreading is enabled
+    THREADS_PER_CORE=$(lscpu | grep "Thread(s) per core: " | tr -cd [:digit:])
     if [ -z "$J" ]; then
         NPROC=$(getconf _NPROCESSORS_ONLN)
+       if [ $THREADS_PER_CORE -gt 1 -a $NPROC -gt 2 ] ; then
+               if [ $NPROC -gt 8 ] ; then
+                       NPROC=$((NPROC/4))
+               else
+                       NPROC=$((NPROC/2))
+               fi
+       fi
     else
         NPROC="$J"
     fi
-- 
2.17.1

Reply via email to