I ran some benchmarks. For the simple example

while true; do (:) & (:); done

the patch raises throughput from 215 to 313 iterations/s and cuts sys+user CPU from 152% to 45%.
Any long-running bash script will tend to exhibit this issue.

On 4/15/15, 5:59 PM, "John Fremlin" <j...@fb.com> wrote:

>Over time, a long-running bash process with ulimit -u set high (e.g.
>100k) will gradually use more and more CPU. My last patch does not
>actually fix the problem in all cases; it just stores fewer things in
>this structure.
>
>This second patch changes the bgpids structure to use a hash table
>pointing to a contiguous circular buffer and frees it after fork.
>
>To see the slowdown and the improvement, set ulimit -u 30000 (many
>production systems have this over 100k) and run something like
>
>while true; do (:) & (:); done
>
>and look at top.
>
>Alternatively, this command shows it clearly:
>
>/usr/bin/time -p bash -c 'start=$(date +%s); end=$(($start+2000)); now=$start; while test $now -le $end; do count=0; while test $(date +%s) = $now; do (:) & (:); count=$((count+1)); done; now=$(date +%s); echo $(($now - $start)) $count; done'
>
>The output is in both cases
>
>real 2000
>
>With the patch, it is dominated by copying page table entries on fork:
>
>user 94.14
>sys 657.74
>
>Without the patch, most time is spent in the bgp_* functions:
>
>user 1637.16
>sys 1337.58
>
>The number of iterations of this busy loop is much higher with the
>patch too :)

Here is some benchmark data over 10k seconds (two+ hours); times are in seconds.

  > circular_total
         real    user   kernel  iterations
   1 10000.15  669.06  4227.53     3147086
   2 10000.97  620.34  4233.99     3121982
   3 10000.98  678.59  4348.51     3111725
   4 10000.97  460.06  3479.10     3177468
   5 10000.98  488.54  3712.60     3145473
   6 10000.98  463.15  3489.42     3174081
   7 10000.97  459.57  3489.19     3162874
   8 10000.98  482.22  3628.21     3148002
   9 10000.98  686.72  4134.55     2949696
  10 10000.98  680.75  4254.89     3134325
  11 10000.98  678.37  4224.12     3143196

  > unpatched_total
         real     user   kernel  iterations
   1 10000.98  8594.61  6584.06     2142822
   2 10000.98  8596.48  6565.17     2142218
   3 10000.97  8467.03  6567.20     2132762
   4 10000.97  8674.81  6574.01     2161381
   5 10000.98  8670.63  6560.84     2158351
   6 10000.98  8646.22  6555.38     2161951
   7 10000.97  8631.14  6563.89     2150517
   8 10000.97  8735.51  6525.91     2156905
   9 10000.92  8748.48  6472.00     2165005
  10 10000.97  8748.66  6498.08     2159130

Note that the asymptotic slowdown over time (the slope of the per-second iteration count against elapsed time) is now under 10% of what it was: -1.219e-04 with the patch versus -0.001823 without.

  > unpatched_after_start <- unpatched_long[which(unpatched_long$V1 > 2000),]
  > lm(unpatched_after_start$V2 ~ unpatched_after_start$V1)

  Call:
  lm(formula = unpatched_after_start$V2 ~ unpatched_after_start$V1)

  Coefficients:
               (Intercept)  unpatched_after_start$V1
                223.087392                 -0.001823

  > circular_after_start <- circular_long[which(circular_long$V1 > 2000),]
  > lm(circular_after_start$V2 ~ circular_after_start$V1)

  Call:
  lm(formula = circular_after_start$V2 ~ circular_after_start$V1)

  Coefficients:
              (Intercept)  circular_after_start$V1
                3.068e+02               -1.219e-04
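
For readers who have not looked at the patch: the idea of "a hash table pointing to a contiguous circular buffer" can be illustrated roughly as below. This is only a minimal sketch under my own assumptions, not the code from the patch or from bash's jobs.c. Every name here (pid_ring, ring_add, ring_lookup, RING_SIZE, NBUCKETS) is hypothetical, and the capacity is a small fixed constant where a real shell would presumably size it from the process limit.

```c
#include <assert.h>
#include <sys/types.h>

#define RING_SIZE      8     /* hypothetical capacity, fixed for the sketch */
#define NBUCKETS       8     /* hypothetical number of hash buckets */
#define NO_ENTRY       (-1)  /* end-of-chain marker */
#define RING_NOT_FOUND (-1)  /* returned when a pid is not stored */

struct pid_entry {
    pid_t pid;
    int status;
    int next;   /* ring index of next entry in the same bucket, or NO_ENTRY */
    int used;   /* nonzero once the slot holds a live entry */
};

struct pid_ring {
    struct pid_entry slots[RING_SIZE];  /* contiguous circular buffer */
    int buckets[NBUCKETS];              /* hash table of ring indices */
    int head;                           /* next slot to (over)write */
};

static unsigned bucket_of(pid_t pid)
{
    return (unsigned)pid % NBUCKETS;
}

static void ring_init(struct pid_ring *r)
{
    for (int i = 0; i < RING_SIZE; i++) {
        r->slots[i].used = 0;
        r->slots[i].next = NO_ENTRY;
    }
    for (int i = 0; i < NBUCKETS; i++)
        r->buckets[i] = NO_ENTRY;
    r->head = 0;
}

/* Detach slot idx from its bucket chain before the slot is reused. */
static void ring_unlink(struct pid_ring *r, int idx)
{
    int *link = &r->buckets[bucket_of(r->slots[idx].pid)];
    while (*link != NO_ENTRY && *link != idx)
        link = &r->slots[*link].next;
    assert(*link == idx);   /* a used slot is always on its chain */
    *link = r->slots[idx].next;
    r->slots[idx].used = 0;
}

/* Record pid's exit status, evicting the oldest entry when the ring is full. */
static void ring_add(struct pid_ring *r, pid_t pid, int status)
{
    int idx = r->head;
    if (r->slots[idx].used)
        ring_unlink(r, idx);
    unsigned b = bucket_of(pid);
    r->slots[idx] = (struct pid_entry){ pid, status, r->buckets[b], 1 };
    r->buckets[b] = idx;
    r->head = (idx + 1) % RING_SIZE;
}

/* Return the stored status for pid, or RING_NOT_FOUND. */
static int ring_lookup(const struct pid_ring *r, pid_t pid)
{
    for (int i = r->buckets[bucket_of(pid)]; i != NO_ENTRY; i = r->slots[i].next)
        if (r->slots[i].pid == pid)
            return r->slots[i].status;
    return RING_NOT_FOUND;
}
```

In this sketch a lookup only walks one short bucket chain rather than every remembered pid, and an insert overwrites the oldest slot in place, so no per-entry allocation is needed. That matches the general shape described above, but the actual patch may differ in all details.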