On 08/17/2016 01:04 PM, Aaron Lu wrote:
> On 08/16/2016 05:56 PM, Xin Long wrote:
>>>>>
>>>>> I'm testing on Linus' master, can we all use that please?
>>>>>
>>>>
>>>> [git] git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
>>>>
>>>> [mechine]
>>>> Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
>>>> mem 62G (66000220K)
>>>>
>>>> [system]
>>>> # cat /etc/redhat-release
>>>> Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)
>>>>
>>>> [commit 3684b03]
>>>> [root@hp-dl380pg8-11 lxin]# uname -r
>>>> 4.8.0-rc2.3684b03
>>>> [root@hp-dl380pg8-11 lxin]# cat test.sh
>>>> killall -0 netserver || netserver -4 &
>>>> netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1
>>>
>>> I just realized the test we are doing is not exactly the same.
>>> As the original report says:
>>> ip: ipv4
>>> runtime: 300s
>>> nr_threads: 200%
>>> cluster: cs-localhost
>>> send_size: 10K
>>> test: SCTP_STREAM_MANY
>>> cpufreq_governor: performance
>>>
>>> Note the nr_threads: 200%, which means to start 2 times of CPU number
>>> processes of netperf.
>>>
>>> In our IVB i3(2 cores, 2 threads per core) case, 8 netperf processes
>>> are started concurrently:
>> OK, understand.
>>
>>>
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>> 2016-07-27 03:48:09 netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K
>>> -H 127.0.0.1 &
>>>
>>> The throughput is the average of those runs.
>>>
>>> And I think we should be doing test on:
>>> commit a6c2f79287 ("sctp: implement prsctp TTL policy") (the bisected one)
>>> and
>>> commit 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt") (its
>>> immediate parent)
>>> instead of Linus' master HEAD to avoid other factors.
>>>
>> OK, I will do tests as your suggestion now, but need to rebuild again :D
>>
>> can you disable pr_enable with "sysctl -w net.sctp.prsctp_enable=0",
>> then try again?
>
> For commit a6c2f79287 ("sctp: implement prsctp TTL policy"), no matter
> the value of net.sctp.prsctp_enable, the throughput is almost the same:

The perf-profile data for the two commits is attached. It covers only the
prsctp_enable=1 case; for some reason the profile was not collected for the
0 case, and I'm looking into that now.

The CPU gets much more idle time in the bisected commit a6c2f79287:

    68.89%   0.70%  [kernel.kallsyms]  [k] entry_SYSCALL_64_fastpath
    49.32%   0.12%  [kernel.kallsyms]  [k] sys_sendmsg
    49.17%   0.12%  [kernel.kallsyms]  [k] __sys_sendmsg
    48.58%   0.22%  [kernel.kallsyms]  [k] ___sys_sendmsg
    46.69%   0.06%  [kernel.kallsyms]  [k] sock_sendmsg
    46.31%   0.16%  [kernel.kallsyms]  [k] inet_sendmsg
    45.90%   0.98%  [kernel.kallsyms]  [k] sctp_sendmsg
    29.66%   0.45%  [kernel.kallsyms]  [k] sctp_do_sm
    29.54%   0.23%  [kernel.kallsyms]  [k] cpu_startup_entry
    28.81%   0.68%  [kernel.kallsyms]  [k] sctp_cmd_interpreter.isra.24
    26.20%   0.00%  [kernel.kallsyms]  [k] start_secondary
    23.04%   0.09%  [kernel.kallsyms]  [k] sctp_inq_push
    23.03%   0.08%  [kernel.kallsyms]  [k] call_cpuidle
    22.94%   0.00%  [kernel.kallsyms]  [k] cpuidle_enter
    22.60%   0.18%  [kernel.kallsyms]  [k] cpuidle_enter_state
    21.99%  21.99%  [kernel.kallsyms]  [k] intel_idle
    ... ...

While its immediate parent, commit 826d253d57, is mostly busy working:

    98.53%   0.83%  [kernel.kallsyms]  [k] entry_SYSCALL_64_fastpath
    78.13%   0.12%  [kernel.kallsyms]  [k] sys_sendmsg
    78.03%   0.16%  [kernel.kallsyms]  [k] __sys_sendmsg
    77.08%   0.28%  [kernel.kallsyms]  [k] ___sys_sendmsg
    74.44%   0.08%  [kernel.kallsyms]  [k] sock_sendmsg
    73.82%   0.13%  [kernel.kallsyms]  [k] inet_sendmsg
    73.34%   1.44%  [kernel.kallsyms]  [k] sctp_sendmsg
    47.52%   0.75%  [kernel.kallsyms]  [k] sctp_do_sm
    46.19%   0.90%  [kernel.kallsyms]  [k] sctp_cmd_interpreter.isra.24
    37.17%   1.43%  [kernel.kallsyms]  [k] sctp_outq_flush
    36.93%   0.08%  [kernel.kallsyms]  [k] sctp_outq_uncork
    34.24%   0.15%  [kernel.kallsyms]  [k] sctp_inq_push
    ... ...

No idle-related function is above 1% there.

Could the bisected commit be what lets the CPU go idle?

Thanks,
Aaron

perf-profile-a6c2f79287.gz
Description: application/gzip
perf-profile-826d253d57.gz
Description: application/gzip
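
For anyone reproducing this by hand, here is a minimal sketch of how the "nr_threads: 200%" setting quoted above could be turned into concurrent netperf runs. It is not the actual test harness: the function names netperf_count and run_netperf_many are made up for illustration, and it assumes that netserver is already listening on 127.0.0.1 and that nproc (coreutils) is available. The netperf options are the ones quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the original harness): number of netperf
# processes to start for a given logical CPU count and nr_threads percentage.
netperf_count() {
    cpus=$1
    percent=$2
    echo $(( cpus * percent / 100 ))
}

# Hypothetical helper: start N netperf instances concurrently with the
# options quoted in this thread, then wait for all of them to finish.
run_netperf_many() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        netperf -4 -t SCTP_STREAM_MANY -c -C -l 300 -- -m 10K -H 127.0.0.1 &
        i=$(( i + 1 ))
    done
    wait
}

# Example invocation, left commented out so that sourcing this file has no
# side effects:
# run_netperf_many "$(netperf_count "$(nproc)" 200)"
```

With 4 logical CPUs and 200%, netperf_count gives 8, which matches the 8 netperf invocations quoted above. Averaging the per-process throughput is left out of the sketch.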