Hi

On Wed, Aug 10, 2016 at 01:26:41PM +0200, Ingo Molnar wrote:
> Nice detective work! I'm wondering, where do we stand if compared with a
> pre-6e998916dfe3 kernel?
>
> I admit this is a difficult question: 6e998916dfe3 does not revert cleanly
> and I suspect v3.17 does not run easily on a recent distro. Could you
> attempt to revert the bad effects of 6e998916dfe3 perhaps, just to get
> numbers - i.e. don't try to make the result correct, just see what the
> performance gap is, roughly.
>
> If there's still a significant gap then it might make sense to optimize
> this some more.

I measured the performance of a (partial) revert on 4.7 using the mmtests
instructions from Giovanni, and also tested another possible fix (draft
version):

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 75f98c5..54fdf6d 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -294,6 +294,8 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 	unsigned int seq, nextseq;
 	unsigned long flags;
 
+	(void) task_sched_runtime(tsk);
+
 	rcu_read_lock();
 	/* Attempt a lockless read on the first round. */
 	nextseq = 0;
@@ -308,7 +310,7 @@ void thread_group_cputime(struct task_struct *tsk, struct task_cputime *times)
 			task_cputime(t, &utime, &stime);
 			times->utime += utime;
 			times->stime += stime;
-			times->sum_exec_runtime += task_sched_runtime(t);
+			times->sum_exec_runtime += t->se.sum_exec_runtime;
 		}
 		/* If lockless access failed, take the lock. */
 		nextseq = 1;
---

The mmtests benchmark results are below (the full compare-kernels.sh output
is in the attachment):

   vanilla-4.7           revert         prefetch            patch
4.74 (  0.00%)   3.04 ( 35.93%)   4.09 ( 13.81%)   1.30 ( 72.59%)
5.49 (  0.00%)   5.00 (  8.97%)   5.34 (  2.72%)   1.03 ( 81.16%)
6.12 (  0.00%)   4.91 ( 19.73%)   5.97 (  2.40%)   0.90 ( 85.27%)
6.68 (  0.00%)   4.90 ( 26.66%)   6.02 (  9.75%)   0.88 ( 86.89%)
7.21 (  0.00%)   5.13 ( 28.85%)   6.70 (  7.09%)   0.87 ( 87.91%)
7.66 (  0.00%)   5.22 ( 31.80%)   7.17 (  6.39%)   0.92 ( 88.01%)
7.91 (  0.00%)   5.36 ( 32.22%)   7.30 (  7.72%)   0.95 ( 87.97%)
7.95 (  0.00%)   5.35 ( 32.73%)   7.34 (  7.66%)   1.06 ( 86.66%)
8.00 (  0.00%)   5.33 ( 33.31%)   7.38 (  7.73%)   1.13 ( 85.82%)
5.61 (  0.00%)   3.55 ( 36.76%)   4.53 ( 19.23%)   2.29 ( 59.28%)
5.66 (  0.00%)   4.32 ( 23.79%)   4.75 ( 16.18%)   3.65 ( 35.46%)
5.98 (  0.00%)   4.97 ( 16.87%)   5.96 (  0.35%)   3.62 ( 39.40%)
6.58 (  0.00%)   4.94 ( 24.93%)   6.04 (  8.32%)   3.63 ( 44.89%)
7.19 (  0.00%)   5.18 ( 28.01%)   6.68 (  7.13%)   3.65 ( 49.22%)
7.67 (  0.00%)   5.27 ( 31.29%)   7.16 (  6.63%)   3.62 ( 52.76%)
7.88 (  0.00%)   5.36 ( 31.98%)   7.28 (  7.58%)   3.65 ( 53.71%)
7.99 (  0.00%)   5.39 ( 32.52%)   7.40 (  7.42%)   3.65 ( 54.25%)

The patch works because we update sum_exec_runtime on the current thread,
which ensures we see a proper sum_exec_runtime value on different CPUs. I
tested it with the reproducers from commits 6e998916dfe32 and d670ec13178d0;
the patch did not break them. I'm going to run some other tests.

The patch is a draft version for early review. task_sched_runtime() will be
simplified (since it is called only on the current thread) and possibly split
into two functions: one that calls update_curr() and another that returns
sum_exec_runtime (ensuring it is consistent on 32-bit arches).

Stanislaw
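
To make the change in the draft diff easier to follow, here is a toy userspace model of it in C. It is only a sketch under loud assumptions: `struct toy_task`, `pending_ns` and the `toy_*` functions are hypothetical names that do not exist in the kernel, and the model leaves out locking, RCU and the seqlock retry loop entirely. It only contrasts the two summation strategies: refreshing every thread before reading it, versus refreshing only the calling thread and then reading the stored per-thread values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a task; this is NOT the kernel's struct task_struct. */
struct toy_task {
	uint64_t sum_exec_runtime; /* runtime already accounted, in ns */
	uint64_t pending_ns;       /* time run since the last accounting update */
};

/* Hypothetical analogue of update_curr(): fold pending time into the sum. */
static void toy_update_curr(struct toy_task *t)
{
	t->sum_exec_runtime += t->pending_ns;
	t->pending_ns = 0;
}

/* Old scheme: refresh every thread before adding its runtime. */
static uint64_t toy_group_runtime_old(struct toy_task *threads, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++) {
		toy_update_curr(&threads[i]);
		sum += threads[i].sum_exec_runtime;
	}
	return sum;
}

/* Draft scheme: refresh only the caller, then read the stored values. */
static uint64_t toy_group_runtime_draft(struct toy_task *threads, size_t n,
					size_t current)
{
	uint64_t sum = 0;

	assert(current < n); /* the caller must be one of the group's threads */
	toy_update_curr(&threads[current]);

	for (size_t i = 0; i < n; i++)
		sum += threads[i].sum_exec_runtime;
	return sum;
}
```

In this model the draft scheme does one refresh per call instead of one per thread, at the price of not folding in the pending time of the other threads. Whether that matches the real accounting guarantees is exactly what the reproducers mentioned above are meant to check.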

poundtime
                                                 vanilla               rever             prefetc                 mas
                                                     4.7              revert            prefetch                mask
Min      real-pound_clock_gettime-2     4.38 (    0.00%)    2.73 (   37.67%)    3.62 (   17.35%)    1.19 (   72.83%)
Min      real-pound_clock_gettime-5     5.40 (    0.00%)    4.76 (   11.85%)    4.49 (   16.85%)    0.99 (   81.67%)
Min      real-pound_clock_gettime-8     5.83 (    0.00%)    4.88 (   16.30%)    5.91 (   -1.37%)    0.88 (   84.91%)
Min      real-pound_clock_gettime-12    6.55 (    0.00%)    4.87 (   25.65%)    5.98 (    8.70%)    0.84 (   87.18%)
Min      real-pound_clock_gettime-21    7.11 (    0.00%)    5.10 (   28.27%)    6.63 (    6.75%)    0.85 (   88.05%)
Min      real-pound_clock_gettime-30    7.56 (    0.00%)    5.20 (   31.22%)    7.08 (    6.35%)    0.87 (   88.49%)
Min      real-pound_clock_gettime-48    7.78 (    0.00%)    5.24 (   32.65%)    7.20 (    7.46%)    0.92 (   88.17%)
Min      real-pound_clock_gettime-79    7.89 (    0.00%)    5.23 (   33.71%)    7.20 (    8.75%)    1.00 (   87.33%)
Min      real-pound_clock_gettime-96    7.88 (    0.00%)    5.24 (   33.50%)    7.29 (    7.49%)    1.09 (   86.17%)
Min      real-pound_times-2             4.87 (    0.00%)    3.19 (   34.50%)    4.00 (   17.86%)    2.06 (   57.70%)
Min      real-pound_times-5             5.59 (    0.00%)    3.91 (   30.05%)    4.61 (   17.53%)    3.61 (   35.42%)
Min      real-pound_times-8             5.74 (    0.00%)    4.88 (   14.98%)    5.80 (   -1.05%)    3.56 (   37.98%)
Min      real-pound_times-12            6.44 (    0.00%)    4.90 (   23.91%)    6.00 (    6.83%)    3.52 (   45.34%)
Min      real-pound_times-21            7.11 (    0.00%)    5.11 (   28.13%)    6.61 (    7.03%)    3.59 (   49.51%)
Min      real-pound_times-30            7.60 (    0.00%)    5.20 (   31.58%)    7.03 (    7.50%)    3.54 (   53.42%)
Min      real-pound_times-48            7.80 (    0.00%)    5.24 (   32.82%)    7.20 (    7.69%)    3.61 (   53.72%)
Min      real-pound_times-79            7.92 (    0.00%)    5.24 (   33.84%)    7.31 (    7.70%)    3.61 (   54.42%)
Min      real-pound_times-96            7.94 (    0.00%)    5.24 (   34.01%)    7.29 (    8.19%)    3.58 (   54.91%)
Min      syst-pound_clock_gettime-2     8.54 (    0.00%)    4.89 (   42.74%)    6.98 (   18.27%)    2.16 (   74.71%)
Min      syst-pound_clock_gettime-5    26.57 (    0.00%)   23.29 (   12.34%)   22.09 (   16.86%)    4.47 (   83.18%)
Min      syst-pound_clock_gettime-8    45.82 (    0.00%)   38.02 (   17.02%)   46.61 (   -1.72%)    6.44 (   85.95%)
Min      syst-pound_clock_gettime-12   77.23 (    0.00%)   56.61 (   26.70%)   69.25 (   10.33%)    9.34 (   87.91%)
Min      syst-pound_clock_gettime-21  147.44 (    0.00%)  103.97 (   29.48%)  134.76 (    8.60%)   15.12 (   89.74%)
Min      syst-pound_clock_gettime-30  176.07 (    0.00%)  117.81 (   33.09%)  162.77 (    7.55%)   15.95 (   90.94%)
Min      syst-pound_clock_gettime-48  182.93 (    0.00%)  119.92 (   34.44%)  168.06 (    8.13%)   19.82 (   89.17%)
Min      syst-pound_clock_gettime-79  186.13 (    0.00%)  123.31 (   33.75%)  170.34 (    8.48%)   22.90 (   87.70%)
Min      syst-pound_clock_gettime-96  187.05 (    0.00%)  124.22 (   33.59%)  172.67 (    7.69%)   25.19 (   86.53%)
Min      syst-pound_times-2             9.55 (    0.00%)    6.22 (   34.87%)    7.80 (   18.32%)    3.90 (   59.16%)
Min      syst-pound_times-5            27.68 (    0.00%)   19.24 (   30.49%)   22.76 (   17.77%)   17.56 (   36.56%)
Min      syst-pound_times-8            45.11 (    0.00%)   38.75 (   14.10%)   45.15 (   -0.09%)   27.77 (   38.44%)
Min      syst-pound_times-12           76.60 (    0.00%)   56.89 (   25.73%)   71.06 (    7.23%)   41.64 (   45.64%)
Min      syst-pound_times-21          145.25 (    0.00%)  102.48 (   29.45%)  136.15 (    6.27%)   72.98 (   49.76%)
Min      syst-pound_times-30          175.03 (    0.00%)  118.89 (   32.07%)  161.32 (    7.83%)   79.91 (   54.34%)
Min      syst-pound_times-48          183.61 (    0.00%)  121.06 (   34.07%)  167.26 (    8.90%)   83.24 (   54.66%)
Min      syst-pound_times-79          187.18 (    0.00%)  123.24 (   34.16%)  173.22 (    7.46%)   84.36 (   54.93%)
Min      syst-pound_times-96          188.88 (    0.00%)  124.04 (   34.33%)  173.52 (    8.13%)   83.02 (   56.05%)
Amean    real-pound_clock_gettime-2     4.74 (    0.00%)    3.04 (   35.93%)    4.09 (   13.81%)    1.30 (   72.59%)
Amean    real-pound_clock_gettime-5     5.49 (    0.00%)    5.00 (    8.97%)    5.34 (    2.72%)    1.03 (   81.16%)
Amean    real-pound_clock_gettime-8     6.12 (    0.00%)    4.91 (   19.73%)    5.97 (    2.40%)    0.90 (   85.27%)
Amean    real-pound_clock_gettime-12    6.68 (    0.00%)    4.90 (   26.66%)    6.02 (    9.75%)    0.88 (   86.89%)
Amean    real-pound_clock_gettime-21    7.21 (    0.00%)    5.13 (   28.85%)    6.70 (    7.09%)    0.87 (   87.91%)
Amean    real-pound_clock_gettime-30    7.66 (    0.00%)    5.22 (   31.80%)    7.17 (    6.39%)    0.92 (   88.01%)
Amean    real-pound_clock_gettime-48    7.91 (    0.00%)    5.36 (   32.22%)    7.30 (    7.72%)    0.95 (   87.97%)
Amean    real-pound_clock_gettime-79    7.95 (    0.00%)    5.35 (   32.73%)    7.34 (    7.66%)    1.06 (   86.66%)
Amean    real-pound_clock_gettime-96    8.00 (    0.00%)    5.33 (   33.31%)    7.38 (    7.73%)    1.13 (   85.82%)
Amean    real-pound_times-2             5.61 (    0.00%)    3.55 (   36.76%)    4.53 (   19.23%)    2.29 (   59.28%)
Amean    real-pound_times-5             5.66 (    0.00%)    4.32 (   23.79%)    4.75 (   16.18%)    3.65 (   35.46%)
Amean    real-pound_times-8             5.98 (    0.00%)    4.97 (   16.87%)    5.96 (    0.35%)    3.62 (   39.40%)
Amean    real-pound_times-12            6.58 (    0.00%)    4.94 (   24.93%)    6.04 (    8.32%)    3.63 (   44.89%)
Amean    real-pound_times-21            7.19 (    0.00%)    5.18 (   28.01%)    6.68 (    7.13%)    3.65 (   49.22%)
Amean    real-pound_times-30            7.67 (    0.00%)    5.27 (   31.29%)    7.16 (    6.63%)    3.62 (   52.76%)
Amean    real-pound_times-48            7.88 (    0.00%)    5.36 (   31.98%)    7.28 (    7.58%)    3.65 (   53.71%)
Amean    real-pound_times-79            7.99 (    0.00%)    5.39 (   32.52%)    7.40 (    7.42%)    3.65 (   54.25%)
Amean    real-pound_times-96            8.01 (    0.00%)    5.35 (   33.20%)    7.36 (    8.09%)    3.64 (   54.49%)
Amean    syst-pound_clock_gettime-2     9.22 (    0.00%)    5.45 (   40.95%)    7.90 (   14.32%)    2.34 (   74.66%)
Amean    syst-pound_clock_gettime-5    27.03 (    0.00%)   24.21 (   10.40%)   26.24 (    2.90%)    4.73 (   82.48%)
Amean    syst-pound_clock_gettime-8    48.33 (    0.00%)   38.40 (   20.55%)   47.11 (    2.52%)    6.64 (   86.25%)
Amean    syst-pound_clock_gettime-12   78.93 (    0.00%)   57.30 (   27.41%)   71.04 (   10.00%)    9.69 (   87.72%)
Amean    syst-pound_clock_gettime-21  149.27 (    0.00%)  105.34 (   29.43%)  138.19 (    7.42%)   16.50 (   88.95%)
Amean    syst-pound_clock_gettime-30  178.36 (    0.00%)  119.83 (   32.82%)  166.75 (    6.51%)   18.67 (   89.53%)
Amean    syst-pound_clock_gettime-48  185.77 (    0.00%)  124.80 (   32.82%)  171.14 (    7.88%)   21.12 (   88.63%)
Amean    syst-pound_clock_gettime-79  188.17 (    0.00%)  126.34 (   32.86%)  173.99 (    7.53%)   24.07 (   87.21%)
Amean    syst-pound_clock_gettime-96  190.24 (    0.00%)  126.63 (   33.44%)  175.32 (    7.84%)   26.12 (   86.27%)
Amean    syst-pound_times-2            11.02 (    0.00%)    6.91 (   37.27%)    8.85 (   19.68%)    4.36 (   60.45%)
Amean    syst-pound_times-5            27.99 (    0.00%)   21.31 (   23.88%)   23.42 (   16.32%)   17.95 (   35.87%)
Amean    syst-pound_times-8            47.33 (    0.00%)   39.27 (   17.04%)   47.16 (    0.35%)   28.56 (   39.66%)
Amean    syst-pound_times-12           78.24 (    0.00%)   58.26 (   25.55%)   71.55 (    8.55%)   42.78 (   45.32%)
Amean    syst-pound_times-21          148.75 (    0.00%)  106.28 (   28.55%)  138.22 (    7.08%)   74.25 (   50.09%)
Amean    syst-pound_times-30          177.74 (    0.00%)  121.16 (   31.83%)  166.70 (    6.21%)   81.82 (   53.96%)
Amean    syst-pound_times-48          184.85 (    0.00%)  125.37 (   32.18%)  170.87 (    7.56%)   84.20 (   54.45%)
Amean    syst-pound_times-79          189.50 (    0.00%)  127.45 (   32.74%)  175.58 (    7.34%)   86.01 (   54.61%)
Amean    syst-pound_times-96          190.56 (    0.00%)  127.11 (   33.30%)  175.08 (    8.12%)   86.03 (   54.85%)
Stddev   real-pound_clock_gettime-2     0.25 (    0.00%)    0.27 (   -7.76%)    0.41 (  -65.62%)    0.10 (   60.73%)
Stddev   real-pound_clock_gettime-5     0.07 (    0.00%)    0.09 (  -35.10%)    0.51 ( -674.46%)    0.05 (   26.28%)
Stddev   real-pound_clock_gettime-8     0.28 (    0.00%)    0.02 (   92.09%)    0.04 (   86.10%)    0.02 (   93.65%)
Stddev   real-pound_clock_gettime-12    0.08 (    0.00%)    0.02 (   78.31%)    0.04 (   52.02%)    0.02 (   78.95%)
Stddev   real-pound_clock_gettime-21    0.06 (    0.00%)    0.02 (   68.54%)    0.11 (  -70.01%)    0.01 (   78.27%)
Stddev   real-pound_clock_gettime-30    0.05 (    0.00%)    0.01 (   75.00%)    0.10 (  -98.93%)    0.04 (   20.82%)
Stddev   real-pound_clock_gettime-48    0.09 (    0.00%)    0.19 ( -106.51%)    0.08 (   15.24%)    0.04 (   58.70%)
Stddev   real-pound_clock_gettime-79    0.03 (    0.00%)    0.10 ( -191.56%)    0.08 ( -138.02%)    0.04 (  -21.18%)
Stddev   real-pound_clock_gettime-96    0.05 (    0.00%)    0.08 (  -56.69%)    0.07 (  -21.04%)    0.04 (   31.40%)
Stddev   real-pound_times-2             0.55 (    0.00%)    0.25 (   53.82%)    0.38 (   30.80%)    0.14 (   74.19%)
Stddev   real-pound_times-5             0.06 (    0.00%)    0.28 ( -358.77%)    0.13 ( -108.26%)    0.03 (   54.64%)
Stddev   real-pound_times-8             0.25 (    0.00%)    0.04 (   83.52%)    0.06 (   76.99%)    0.06 (   76.94%)
Stddev   real-pound_times-12            0.09 (    0.00%)    0.05 (   41.52%)    0.02 (   77.55%)    0.04 (   51.60%)
Stddev   real-pound_times-21            0.06 (    0.00%)    0.15 ( -141.91%)    0.11 (  -74.22%)    0.03 (   48.73%)
Stddev   real-pound_times-30            0.06 (    0.00%)    0.14 ( -129.04%)    0.10 (  -66.59%)    0.04 (   30.36%)
Stddev   real-pound_times-48            0.05 (    0.00%)    0.13 ( -151.20%)    0.07 (  -37.30%)    0.02 (   54.64%)
Stddev   real-pound_times-79            0.04 (    0.00%)    0.11 ( -205.48%)    0.07 (  -97.82%)    0.03 (   28.17%)
Stddev   real-pound_times-96            0.05 (    0.00%)    0.05 (   -1.83%)    0.04 (   24.17%)    0.04 (   20.00%)
Stddev   syst-pound_clock_gettime-2     0.47 (    0.00%)    0.45 (    4.96%)    0.79 (  -66.33%)    0.18 (   61.36%)
Stddev   syst-pound_clock_gettime-5     0.32 (    0.00%)    0.39 (  -20.09%)    2.49 ( -666.63%)    0.25 (   21.71%)
Stddev   syst-pound_clock_gettime-8     2.25 (    0.00%)    0.26 (   88.54%)    0.40 (   82.10%)    0.17 (   92.55%)
Stddev   syst-pound_clock_gettime-12    1.23 (    0.00%)    0.43 (   64.59%)    0.73 (   40.82%)    0.19 (   84.58%)
Stddev   syst-pound_clock_gettime-21    1.15 (    0.00%)    1.06 (    7.62%)    2.64 ( -129.56%)    0.66 (   42.45%)
Stddev   syst-pound_clock_gettime-30    1.34 (    0.00%)    1.26 (    6.25%)    2.69 (  -99.81%)    1.58 (  -17.86%)
Stddev   syst-pound_clock_gettime-48    2.52 (    0.00%)    4.85 (  -92.44%)    2.12 (   15.94%)    1.08 (   57.23%)
Stddev   syst-pound_clock_gettime-79    1.22 (    0.00%)    2.51 ( -105.82%)    1.99 (  -62.56%)    0.96 (   21.62%)
Stddev   syst-pound_clock_gettime-96    1.54 (    0.00%)    2.21 (  -43.34%)    1.74 (  -12.67%)    0.80 (   48.24%)
Stddev   syst-pound_times-2             1.09 (    0.00%)    0.50 (   53.61%)    0.76 (   30.43%)    0.28 (   74.11%)
Stddev   syst-pound_times-5             0.30 (    0.00%)    1.41 ( -367.82%)    0.65 ( -115.62%)    0.21 (   29.66%)
Stddev   syst-pound_times-8             2.12 (    0.00%)    0.27 (   87.24%)    0.71 (   66.44%)    0.55 (   73.94%)
Stddev   syst-pound_times-12            1.03 (    0.00%)    0.74 (   27.70%)    0.37 (   64.41%)    0.47 (   54.81%)
Stddev   syst-pound_times-21            1.60 (    0.00%)    3.07 (  -92.49%)    2.30 (  -43.99%)    0.93 (   41.93%)
Stddev   syst-pound_times-30            1.75 (    0.00%)    3.05 (  -74.55%)    2.84 (  -62.67%)    1.17 (   32.95%)
Stddev   syst-pound_times-48            0.79 (    0.00%)    3.36 ( -327.41%)    2.51 ( -219.14%)    0.51 (   34.63%)
Stddev   syst-pound_times-79            1.08 (    0.00%)    2.77 ( -156.12%)    1.84 (  -70.34%)    0.86 (   20.82%)
Stddev   syst-pound_times-96            1.19 (    0.00%)    1.35 (  -13.61%)    1.01 (   15.16%)    1.29 (   -8.56%)
CoeffVar real-pound_clock_gettime-2     5.19 (    0.00%)    8.73 (  -68.19%)    9.97 (  -92.16%)    7.43 (  -43.23%)
CoeffVar real-pound_clock_gettime-5     1.19 (    0.00%)    1.77 (  -48.40%)    9.49 ( -696.07%)    4.66 ( -291.28%)
CoeffVar real-pound_clock_gettime-8     4.53 (    0.00%)    0.45 (   90.14%)    0.64 (   85.76%)    1.95 (   56.89%)
CoeffVar real-pound_clock_gettime-12    1.24 (    0.00%)    0.37 (   70.42%)    0.66 (   46.83%)    2.00 (  -60.60%)
CoeffVar real-pound_clock_gettime-21    0.88 (    0.00%)    0.39 (   55.78%)    1.61 (  -82.98%)    1.58 (  -79.84%)
CoeffVar real-pound_clock_gettime-30    0.68 (    0.00%)    0.25 (   63.35%)    1.44 ( -112.50%)    4.49 ( -560.29%)
CoeffVar real-pound_clock_gettime-48    1.18 (    0.00%)    3.61 ( -204.68%)    1.09 (    8.14%)    4.06 ( -243.23%)
CoeffVar real-pound_clock_gettime-79    0.43 (    0.00%)    1.85 ( -333.44%)    1.10 ( -157.77%)    3.87 ( -808.42%)
CoeffVar real-pound_clock_gettime-96    0.68 (    0.00%)    1.59 ( -134.97%)    0.89 (  -31.18%)    3.28 ( -383.77%)
CoeffVar real-pound_times-2             9.79 (    0.00%)    7.15 (   26.98%)    8.39 (   14.33%)    6.21 (   36.61%)
CoeffVar real-pound_times-5             1.06 (    0.00%)    6.39 ( -501.98%)    2.64 ( -148.46%)    0.75 (   29.71%)
CoeffVar real-pound_times-8             4.24 (    0.00%)    0.84 (   80.17%)    0.98 (   76.91%)    1.61 (   61.95%)
CoeffVar real-pound_times-12            1.29 (    0.00%)    1.01 (   22.11%)    0.32 (   75.51%)    1.14 (   12.18%)
CoeffVar real-pound_times-21            0.87 (    0.00%)    2.91 ( -236.03%)    1.63 (  -87.60%)    0.87 (   -0.97%)
CoeffVar real-pound_times-30            0.78 (    0.00%)    2.62 ( -233.35%)    1.40 (  -78.41%)    1.16 (  -47.41%)
CoeffVar real-pound_times-48            0.65 (    0.00%)    2.40 ( -269.32%)    0.97 (  -48.56%)    0.64 (    2.00%)
CoeffVar real-pound_times-79            0.45 (    0.00%)    2.03 ( -352.70%)    0.96 ( -113.68%)    0.71 (  -57.00%)
CoeffVar real-pound_times-96            0.61 (    0.00%)    0.93 (  -52.43%)    0.50 (   17.50%)    1.07 (  -75.79%)
CoeffVar syst-pound_clock_gettime-2     5.12 (    0.00%)    8.25 (  -60.95%)    9.95 (  -94.12%)    7.81 (  -52.47%)
CoeffVar syst-pound_clock_gettime-5     1.20 (    0.00%)    1.61 (  -34.04%)    9.48 ( -689.57%)    5.37 ( -346.99%)
CoeffVar syst-pound_clock_gettime-8     4.66 (    0.00%)    0.67 (   85.58%)    0.86 (   81.64%)    2.53 (   45.79%)
CoeffVar syst-pound_clock_gettime-12    1.56 (    0.00%)    0.76 (   51.21%)    1.02 (   34.25%)    1.95 (  -25.60%)
CoeffVar syst-pound_clock_gettime-21    0.77 (    0.00%)    1.01 (  -30.89%)    1.91 ( -147.96%)    4.01 ( -420.63%)
CoeffVar syst-pound_clock_gettime-30    0.75 (    0.00%)    1.05 (  -39.54%)    1.61 ( -113.72%)    8.48 (-1026.12%)
CoeffVar syst-pound_clock_gettime-48    1.36 (    0.00%)    3.89 ( -186.46%)    1.24 (    8.75%)    5.11 ( -276.18%)
CoeffVar syst-pound_clock_gettime-79    0.65 (    0.00%)    1.99 ( -206.55%)    1.14 (  -75.81%)    3.98 ( -512.73%)
CoeffVar syst-pound_clock_gettime-96    0.81 (    0.00%)    1.74 ( -115.35%)    0.99 (  -22.26%)    3.05 ( -277.01%)
CoeffVar syst-pound_times-2             9.86 (    0.00%)    7.29 (   26.04%)    8.54 (   13.39%)    6.45 (   34.55%)
CoeffVar syst-pound_times-5             1.08 (    0.00%)    6.62 ( -514.60%)    2.78 ( -157.67%)    1.18 (   -9.68%)
CoeffVar syst-pound_times-8             4.48 (    0.00%)    0.69 (   84.62%)    1.51 (   66.32%)    1.94 (   56.81%)
CoeffVar syst-pound_times-12            1.32 (    0.00%)    1.28 (    2.89%)    0.51 (   61.08%)    1.09 (   17.35%)
CoeffVar syst-pound_times-21            1.07 (    0.00%)    2.89 ( -169.42%)    1.66 (  -54.96%)    1.25 (  -16.34%)
CoeffVar syst-pound_times-30            0.98 (    0.00%)    2.52 ( -156.06%)    1.71 (  -73.43%)    1.43 (  -45.64%)
CoeffVar syst-pound_times-48            0.43 (    0.00%)    2.68 ( -530.20%)    1.47 ( -245.25%)    0.61 (  -43.50%)
CoeffVar syst-pound_times-79            0.57 (    0.00%)    2.17 ( -280.81%)    1.05 (  -83.84%)    1.00 (  -74.46%)
CoeffVar syst-pound_times-96            0.63 (    0.00%)    1.07 (  -70.33%)    0.58 (    7.66%)    1.50 ( -140.44%)
Max      real-pound_clock_gettime-2     5.10 (    0.00%)    3.56 (   30.20%)    4.98 (    2.35%)    1.47 (   71.18%)
Max      real-pound_clock_gettime-5     5.59 (    0.00%)    5.10 (    8.77%)    6.00 (   -7.33%)    1.17 (   79.07%)
Max      real-pound_clock_gettime-8     6.82 (    0.00%)    4.95 (   27.42%)    6.02 (   11.73%)    0.93 (   86.36%)
Max      real-pound_clock_gettime-12    6.82 (    0.00%)    4.93 (   27.71%)    6.13 (   10.12%)    0.90 (   86.80%)
Max      real-pound_clock_gettime-21    7.33 (    0.00%)    5.17 (   29.47%)    7.01 (    4.37%)    0.89 (   87.86%)
Max      real-pound_clock_gettime-30    7.71 (    0.00%)    5.24 (   32.04%)    7.38 (    4.28%)    1.00 (   87.03%)
Max      real-pound_clock_gettime-48    8.11 (    0.00%)    5.86 (   27.74%)    7.47 (    7.89%)    1.05 (   87.05%)
Max      real-pound_clock_gettime-79    8.03 (    0.00%)    5.53 (   31.13%)    7.48 (    6.85%)    1.13 (   85.93%)
Max      real-pound_clock_gettime-96    8.05 (    0.00%)    5.55 (   31.06%)    7.51 (    6.71%)    1.21 (   84.97%)
Max      real-pound_times-2             6.66 (    0.00%)    3.89 (   41.59%)    5.23 (   21.47%)    2.56 (   61.56%)
Max      real-pound_times-5             5.77 (    0.00%)    4.96 (   14.04%)    5.01 (   13.17%)    3.69 (   36.05%)
Max      real-pound_times-8             6.42 (    0.00%)    5.04 (   21.50%)    6.02 (    6.23%)    3.72 (   42.06%)
Max      real-pound_times-12            6.69 (    0.00%)    5.07 (   24.22%)    6.07 (    9.27%)    3.67 (   45.14%)
Max      real-pound_times-21            7.32 (    0.00%)    5.63 (   23.09%)    7.00 (    4.37%)    3.68 (   49.73%)
Max      real-pound_times-30            7.78 (    0.00%)    5.68 (   26.99%)    7.36 (    5.40%)    3.66 (   52.96%)
Max      real-pound_times-48            7.98 (    0.00%)    5.58 (   30.08%)    7.41 (    7.14%)    3.68 (   53.88%)
Max      real-pound_times-79            8.05 (    0.00%)    5.61 (   30.31%)    7.53 (    6.46%)    3.69 (   54.16%)
Max      real-pound_times-96            8.08 (    0.00%)    5.42 (   32.92%)    7.42 (    8.17%)    3.71 (   54.08%)
Max      syst-pound_clock_gettime-2     9.91 (    0.00%)    6.30 (   36.43%)    9.64 (    2.72%)    2.68 (   72.96%)
Max      syst-pound_clock_gettime-5    27.53 (    0.00%)   24.74 (   10.13%)   29.35 (   -6.61%)    5.43 (   80.28%)
Max      syst-pound_clock_gettime-8    53.96 (    0.00%)   38.82 (   28.06%)   47.75 (   11.51%)    6.99 (   87.05%)
Max      syst-pound_clock_gettime-12   81.09 (    0.00%)   57.99 (   28.49%)   71.93 (   11.30%)   10.04 (   87.62%)
Max      syst-pound_clock_gettime-21  151.50 (    0.00%)  107.03 (   29.35%)  145.33 (    4.07%)   17.48 (   88.46%)
Max      syst-pound_clock_gettime-30  179.94 (    0.00%)  121.68 (   32.38%)  172.10 (    4.36%)   21.29 (   88.17%)
Max      syst-pound_clock_gettime-48  191.29 (    0.00%)  136.82 (   28.48%)  174.84 (    8.60%)   23.80 (   87.56%)
Max      syst-pound_clock_gettime-79  190.22 (    0.00%)  130.28 (   31.51%)  177.26 (    6.81%)   25.71 (   86.48%)
Max      syst-pound_clock_gettime-96  192.02 (    0.00%)  132.27 (   31.12%)  178.26 (    7.17%)   27.66 (   85.60%)
Max      syst-pound_times-2            13.10 (    0.00%)    7.57 (   42.21%)   10.21 (   22.06%)    4.89 (   62.67%)
Max      syst-pound_times-5            28.56 (    0.00%)   24.55 (   14.04%)   24.80 (   13.17%)   18.20 (   36.27%)
Max      syst-pound_times-8            50.89 (    0.00%)   39.54 (   22.30%)   47.78 (    6.11%)   29.45 (   42.13%)
Max      syst-pound_times-12           79.85 (    0.00%)   59.80 (   25.11%)   72.21 (    9.57%)   43.27 (   45.81%)
Max      syst-pound_times-21          151.33 (    0.00%)  115.02 (   23.99%)  144.60 (    4.45%)   75.85 (   49.88%)
Max      syst-pound_times-30          180.79 (    0.00%)  130.12 (   28.03%)  171.98 (    4.87%)   83.31 (   53.92%)
Max      syst-pound_times-48          186.61 (    0.00%)  130.89 (   29.86%)  174.40 (    6.54%)   84.85 (   54.53%)
Max      syst-pound_times-79          190.96 (    0.00%)  133.09 (   30.30%)  179.58 (    5.96%)   87.17 (   54.35%)
Max      syst-pound_times-96          192.42 (    0.00%)  128.95 (   32.99%)  177.09 (    7.97%)   87.82 (   54.36%)

               vanilla       rever     prefetc         mas
                   4.7      revert    prefetch        mask
User             54.91       73.30       56.08       47.56
System        21115.14    14616.16    19553.36     6360.52
Elapsed        1247.71      890.24     1149.26      409.20

                               vanilla     rever   prefetc       mas
                                   4.7    revert  prefetch      mask
Minor Faults                    291321    267632    324632    274236
Major Faults                       196       272       279       279
Swap Ins                             0         0         0         0
Swap Outs                            0         0         0         0
Allocation stalls                    0         0         0         0
DMA allocs                           0         0         0         0
DMA32 allocs                     12836     11773     23439     21745
Normal allocs                   252492    245667    302327    270404
Movable allocs                       0         0         0         0
Direct pages scanned                 0         0         0         0
Kswapd pages scanned                 0         0         0         0
Kswapd pages reclaimed               0         0         0         0
Direct pages reclaimed               0         0         0         0
Kswapd efficiency                 100%      100%      100%      100%
Kswapd velocity                  0.000     0.000     0.000     0.000
Direct efficiency                 100%      100%      100%      100%
Direct velocity                  0.000     0.000     0.000     0.000
Percentage direct scans             0%        0%        0%        0%
Zone normal velocity             0.000     0.000     0.000     0.000
Zone dma32 velocity              0.000     0.000     0.000     0.000
Zone dma velocity                0.000     0.000     0.000     0.000
Page writes by reclaim           0.000     0.000     0.000     0.000
Page writes file                     0         0         0         0
Page writes anon                     0         0         0         0
Page reclaim immediate               0         0         0         0
Sector Reads                     24440     38464    144944    143876
Sector Writes                   569300     12712     16036      6956
Page rescued immediate               0         0         0         0
Slabs scanned                        0         0         0         0
Direct inode steals                  0         0         0         0
Kswapd inode steals                  0         0         0         0
Kswapd skipped wait                  0         0         0         0
THP fault alloc                      0         0         0         0
THP collapse alloc                   0         0         0         0
THP splits                           0         0         0         0
THP fault fallback                   0         0         0         0
THP collapse fail                    0         0         0         0
Compaction stalls                    0         0         0         0
Compaction success                   0         0         0         0
Compaction failures                  0         0         0         0
Page migrate success             11177     10858     14598      9857
Page migrate failure                 0         2         1         1
Compaction pages isolated            0         0         0         0
Compaction migrate scanned           0         0         0         0
Compaction free scanned              0         0         0         0
Compaction cost                     11        11        15        10
NUMA alloc hit                  237281    229068    296261    263464
NUMA alloc miss                      7         5         5         6
NUMA interleave hit                  0         0         0         0
NUMA alloc local                237281    229068    296261    263464
NUMA base PTE updates            25433     20398     35883     22264
NUMA huge PMD updates                0         0         0         0
NUMA page range updates          25433     20398     35883     22264
NUMA hint faults                 23242     18097     31026     17002
NUMA hint local faults           10012      6038     14657      6903
NUMA hint local percent             43        33        47        40
NUMA pages migrated              11177     10858     14598      9857
AutoNUMA cost                     116%       90%      155%       85%
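
For reading the parenthesised percentages in the poundtime table, a small C helper is sketched below. The formula is an inference from the printed rows, not something taken from the mmtests sources, and `gain_percent` is a hypothetical name: the relative change is computed against the vanilla column, so a positive number means a lower (better) time. For example, the "Amean syst-pound_times-96" row has 190.56 for vanilla and 86.03 in the last column, which gives about 54.85%, the printed figure.

```c
#include <assert.h>

/*
 * Relative change of `value` against `baseline`, in percent.
 * Positive means `value` is lower than the baseline, which is an
 * improvement for time-based rows. Formula inferred from the table rows.
 */
static double gain_percent(double baseline, double value)
{
	assert(baseline != 0.0); /* undefined for a zero baseline */
	return (baseline - value) / baseline * 100.0;
}
```

Because the table prints values rounded to two decimals, recomputing a percentage from the printed numbers can differ slightly from the printed percentage, most visibly on rows with small values.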