Hi Remi,

Yes, the new version was almost 30% better in the full config test. Great!

> So quite a noticeable gain but it looks like lock contention is still an
> issue. I would like to understand why, if you don't mind answering a few
> questions.
>
> - You mentioned having 32 cores, are they real cores or is it with
> hyper-threading? Intel reports [1] only 8 real cores for the E5-2660, so
> you should probably stick with at most 8 total threads per CPU
> (listeners mostly in your case).

You are right, this is with HT.

CPU(s):                32
Thread(s) per core:    2
Core(s) per socket:    8
Socket(s):             2
NUMA node(s):          2
Model name:            Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
CPU MHz:               2194.783
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              20480K

Regarding the number of listeners, we ran the tests with different numbers of listeners: 1, 2, 4, 8 and 12. 1 listener was the worst, at 120 Kqps. The other configs were more or less the same, oscillating between 165 and 175 Kqps, with the 2 and 8 listener configs being the most stable.

> - I'd be interested in the results of the dumpStats() and
> cache:printStats() commands during your test, as well as a perf top,
> ideally with a vanilla dnsdist and a dnsdist-concur.

See attached file.

> - The cache cleaning algo might be a bit aggressive by default, you can
> tweak it with:
> setCacheCleaningDelay(30)
> setCacheCleaningPercentage(20)

Done, but no impact (because of our test set).

> - Exporting carbon data to our public metronome instance would be great
> too, as it would immediately make a lot of metrics available to us. You
> can do that with: carbonServer('37.252.122.50 ', '<yourname>', 30)

Unfortunately we can't do that; this is on a closed network. We have our own carbon-graphite and check the stats there. I can send you any additional info you would like.

> - Tuning the network buffer might also help:
> net.core.rmem_max=33554432
> net.core.wmem_max=33554432
> net.core.rmem_default=16777216
> net.core.wmem_default=16777216

Already done with very similar values. Also tried kernel.sched_migration_cost_ns, but with no visible impact.

> - Would you consider upgrading your kernel? There has been a lot of
> improvements since 3.10.0, and we noticed huge performance increases in
> the past just by upgrading to a 4.x one.

I would like to do that, but we are required to use Red Hat. We've done some tests on a small Core 2 with 4 cores running 4.9, and we obtained almost the same results as on the "big one". This was a surprise. We are trying to find a way (if security approves) to update the Red Hat kernel.

> Oh and if you didn't already, would you mind setting
> setMaxUDPOutstanding() to 65535? Even at a 99% cache hit ratio, that
> leaves quite a few requests going to the backend so we better be sure we
> don't mess up these. The cache in dnsdist tries very hard not to degrade
> performance, so we prefer skipping the cache and passing the query to a
> backend rather than waiting for a cache lock, for example.

Already done, also no difference. We are sending about 50 distinct queries, repeated continuously.

Will keep testing, but I think this is all we can get for now. The optimum config now seems to be 3 processes with 6 or 8 listeners each. We will have to do some workarounds for the stats (aggregation rules on graphite?) and for the service control scripts.

Thanks again!
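As a possible alternative to Graphite aggregation rules for the multi-process setup, the per-process counters could be summed before they are inspected. The following is only a sketch in Python: it assumes each process's dumpStats() output has already been parsed into a dict, and the function name, the list of counters and the sample values are all made up for illustration.

```python
from collections import Counter

# Monotonic counters that can simply be added up across processes.
# The names follow what dumpStats() prints; the selection is an assumption.
SUMMABLE = ("queries", "responses", "cache-hits", "cache-misses",
            "downstream-timeouts")


def aggregate_stats(per_process):
    """Sum the selected counters over a list of per-process stat dicts."""
    total = Counter()
    for stats in per_process:
        for name in SUMMABLE:
            # A counter missing from one process counts as zero.
            total[name] += stats.get(name, 0)
    return dict(total)


# Hypothetical values for three dnsdist processes, for illustration only.
example = [
    {"queries": 100, "responses": 1, "cache-hits": 99, "cache-misses": 1},
    {"queries": 200, "responses": 2, "cache-hits": 198, "cache-misses": 2},
    {"queries": 300, "responses": 3, "cache-hits": 297, "cache-misses": 3},
]

combined = aggregate_stats(example)
hit_ratio = combined["cache-hits"] / (
    combined["cache-hits"] + combined["cache-misses"])
print(combined, f"hit ratio {hit_ratio:.2%}")
```

Averages such as latency-avg100 and gauges such as real-memory-usage are left out on purpose, since adding them across processes would not give a meaningful value.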
RPM: 8 listeners receiving constant 132 kqps traffic, answering ~96% of queries

> dumpStats()
acl-drops                       0    latency1-10                    28
block-filter                    0    latency10-50                   13
cache-hits               14010959    latency100-1000                41
cache-misses                  344    latency50-100                   1
cpu-sys-msec               147910    no-policy                       0
cpu-user-msec              122819    noncompliant-queries            0
downstream-send-errors          0    noncompliant-responses          0
downstream-timeouts             0    queries                  14011303
dyn-block-nmg-size              0    rdqueries                14011303
dyn-blocked                     0    real-memory-usage       261877760
empty-queries                   0    responses                     344
fd-usage                       63    rule-drop                       0
latency-avg100                0.0    rule-nxdomain                   0
latency-avg1000               0.0    rule-refused                    0
latency-avg10000              0.1    self-answered                   0
latency-avg1000000            0.4    servfail-responses              0
latency-slow                    0    trunc-failures                  0
latency0-1               14011220    uptime                        247

> cache:printStats()
Entries: 54/100000
Hits: 14010959
Misses: 282
Deferred inserts: 15
Deferred lookups: 62
Lookup Collisions: 0
Insert Collisions: 0
TTL Too Shorts: 0

============================ TOP
top - 18:19:14 up 6 days, 30 min,  2 users,  load average: 1.31, 1.04, 0.85
Tasks: 457 total,   1 running, 456 sleeping,   0 stopped,   0 zombie
%Cpu0  :  2.7 us,  3.7 sy,  0.0 ni, 93.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  2.0 us,  1.3 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  : 11.0 us, 11.6 sy,  0.0 ni, 76.0 id,  0.0 wa,  0.0 hi,  1.4 si,  0.0 st
%Cpu3  :  2.3 us,  2.7 sy,  0.0 ni, 95.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  1.7 us,  1.7 sy,  0.0 ni, 95.8 id,  0.0 wa,  0.0 hi,  0.8 si,  0.0 st
%Cpu6  : 11.5 us, 14.1 sy,  0.0 ni, 74.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.7 us,  1.3 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  : 13.1 us, 16.2 sy,  0.0 ni, 70.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :  9.2 us, 11.3 sy,  0.0 ni, 79.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 : 11.4 us, 12.8 sy,  0.0 ni, 75.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 : 10.1 us, 12.1 sy,  0.0 ni, 77.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :  8.7 us,  9.7 sy,  0.0 ni, 81.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 :  9.4 us, 12.1 sy,  0.0 ni, 78.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 : 13.5 us, 15.8 sy,  0.0 ni, 70.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  4.7 us,  5.7 sy,  0.0 ni, 89.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 :  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 :  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 :  0.0 us,  0.0 sy,  0.0 ni, 99.7 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu24 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu25 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu26 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu27 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu28 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu29 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu30 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu31 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 49254044 total, 46658104 free,  1031952 used,  1563988 buff/cache
KiB Swap: 24772604 total, 24772604 free,        0 used. 47697800 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
37316 root      20   0 4153176 259748   4216 S 242.4  0.5   6:50.31 dnsdist
 2855 netdata   20   0  248256 100712   1724 S   0.7  0.2  46:23.50 netdata
24539 netdata   20   0   13788   2404    880 S   0.7  0.0   0:18.04 apps.plugin
  162 root      20   0       0      0      0 S   0.3  0.0  28:31.97 ksoftirqd/5
    1 root      20   0  190628   5736   2416 S   0.0  0.0   0:06.45 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.09 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:07.03 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kworker/u129:0
    8 root      rt   0       0      0      0 S   0.0  0.0   0:00.07 migration/0
    9 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/0
   11 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/1
   12 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/2
   13 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/3
   14 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/4
   15 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/5

================ ss -ua
State     Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
UNCONN   2711040    5888  11.22.33.44:domain  *:*
UNCONN   7758336    8704  11.22.33.44:domain  *:*
UNCONN     66816    5376  11.22.33.44:domain  *:*
UNCONN  12298752    8449  11.22.33.44:domain  *:*
UNCONN    121344    4352  11.22.33.44:domain  *:*
UNCONN   8369664    5376  11.22.33.44:domain  *:*
UNCONN   4817664    8704  11.22.33.44:domain  *:*
UNCONN   7569408    7680  11.22.33.44:domain  *:*

==================================================================================
==================================================================================

CONCUR: 8 listeners receiving constant 177 kqps traffic, answering ~92% of queries

> dumpStats()
acl-drops                       0    latency1-10                   275
block-filter                    0    latency10-50                   33
cache-hits               33147123    latency100-1000                58
cache-misses                 1046    latency50-100                  11
cpu-sys-msec               270489    no-policy                       0
cpu-user-msec              308264    noncompliant-queries            0
downstream-send-errors          0    noncompliant-responses          0
downstream-timeouts             0    queries                  33148177
dyn-block-nmg-size              0    rdqueries                33148177
dyn-blocked                     0    real-memory-usage       263462912
empty-queries                   0    responses                    1046
fd-usage                       61    rule-drop                       0
latency-avg100                0.0    rule-nxdomain                   0
latency-avg1000               0.0    rule-refused                    0
latency-avg10000              0.0    self-answered                   0
latency-avg1000000            0.1    servfail-responses              0
latency-slow                    1    trunc-failures                  0
latency0-1               33147791    uptime                        964

> cache:printStats()
Entries: 54/100000
Hits: 39150638
Misses: 1328
Deferred inserts: 0
Deferred lookups: 0
Lookup Collisions: 0
Insert Collisions: 0
TTL Too Shorts: 0

======================= TOP
top - 12:44:54 up 4 days, 18:55,  5 users,  load average: 1.10, 0.77, 0.81
Tasks: 464 total,   1 running, 463 sleeping,   0 stopped,   0 zombie
%Cpu0  : 11.9 us,  9.8 sy,  0.0 ni, 78.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :  9.5 us,  8.8 sy,  0.0 ni, 81.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.4 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.4 si,  0.0 st
%Cpu3  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :  1.3 us,  2.0 sy,  0.0 ni, 96.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  : 17.5 us, 14.0 sy,  0.0 ni, 68.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  : 10.8 us,  9.8 sy,  0.0 ni, 79.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 : 13.4 us, 10.7 sy,  0.0 ni, 75.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 : 13.4 us, 11.0 sy,  0.0 ni, 75.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 : 17.0 us, 14.5 sy,  0.0 ni, 68.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 : 12.4 us, 10.3 sy,  0.0 ni, 77.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 : 13.9 us, 12.5 sy,  0.0 ni, 73.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :  8.9 us,  8.2 sy,  0.0 ni, 82.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 :  0.3 us,  0.0 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu20 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu21 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu22 :  0.0 us,  0.3 sy,  0.0 ni, 99.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu23 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu24 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu25 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu26 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu27 : 13.7 us, 11.3 sy,  0.0 ni, 74.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu28 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu29 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu30 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu31 :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 49254044 total, 46663260 free,  1017104 used,  1573680 buff/cache
KiB Swap: 24772604 total, 24772604 free,        0 used. 47760996 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
38566 root      20   0 4160416 261492   4416 S 295.7  0.5  12:40.53 dnsdist
 2855 netdata   20   0  240060 100704   1724 S   0.7  0.2  36:54.98 netdata
24318 netdata   20   0   13760   2688    884 S   0.7  0.0   1:49.59 apps.plugin
18842 netdata   20   0   10036   1904   1212 S   0.3  0.0   0:05.66 bash
23291 root      20   0  146552   2488   1448 R   0.3  0.0   0:06.71 top
    1 root      20   0  190628   5736   2416 S   0.0  0.0   0:05.65 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.08 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:05.86 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    7 root      20   0       0      0      0 S   0.0  0.0   0:00.00 kworker/u129:0
    8 root      rt   0       0      0      0 S   0.0  0.0   0:00.07 migration/0
    9 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/0
   11 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/1
   12 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/2
   13 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/3
   14 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/4

================== ss -ua
State     Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
UNCONN  14385408    4864  11.22.33.44:domain  *:*
UNCONN  16774656    7936  11.22.33.44:domain  *:*
UNCONN  11428608   13312  11.22.33.44:domain  *:*
UNCONN  10202112    2560  11.22.33.44:domain  *:*
UNCONN  14304768    4864  11.22.33.44:domain  *:*
UNCONN   9987840    2560  11.22.33.44:domain  *:*
UNCONN  16773888    6400  11.22.33.44:domain  *:*
UNCONN  16342272    3328  11.22.33.44:domain  *:*
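
To relate the two runs above to the "almost 30% better" figure, the answered rate can be estimated as offered load times the answered fraction. A small Python sketch using the rates from the two section headers and the dumpStats() counters; the function names are illustrative and not part of dnsdist, and the answered percentages are the approximate values quoted above.

```python
def answered_kqps(offered_kqps, answered_fraction):
    """Estimated answered rate: offered load times the share of queries answered."""
    return offered_kqps * answered_fraction


def hit_ratio(hits, misses):
    """Cache hit ratio from the cache-hits / cache-misses counters."""
    return hits / (hits + misses)


# Offered load and approximate answered share from the two section headers.
rpm = answered_kqps(132, 0.96)
concur = answered_kqps(177, 0.92)
gain = concur / rpm - 1

print(f"answered: RPM {rpm:.1f} kqps, CONCUR {concur:.1f} kqps, gain {gain:.1%}")

# cache-hits and cache-misses as reported by dumpStats() in each run.
print(f"hit ratio: RPM {hit_ratio(14010959, 344):.5f}, "
      f"CONCUR {hit_ratio(33147123, 1046):.5f}")
```

With these inputs the estimated gain comes out a little under 30%, and both runs are served almost entirely from the cache.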
_______________________________________________
dnsdist mailing list
dnsdist@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/dnsdist