Just a follow-up on this. We upgraded to 1.3, everything fine. Redid our
test with the following results:

  # procs   # listeners   kq/s   % no error
  1         6             340    94%
  1         9             380    98%
  1         15            400    100%
  1         15            500    99%
  1         25            600    98%
  3         6             500    99%

Above 550 Kqps we were over 85% of interface usage and there were many
network errors.
One important thing we noticed is that our production configuration has
~60 drop rules of the form:

  addAction('example.com', DropAction())

With these rules in place, performance dropped by almost 30%. We are moving
those rules to the backend resolvers now, because it doesn't make sense to
take this performance hit when almost all our queries are cached (98% hit
rate). (See the first sketch at the end of this mail for a way to collapse
them into a single rule.)

Thanks again for this great software!

Nico

On Thu, Apr 6, 2017 at 6:33 PM Nico <nicom...@gmail.com> wrote:
> Hi Remi,
> Yes, the new version was almost 30% better in the full config test. Great!
>
> > So quite a noticeable gain, but it looks like lock contention is still an
> > issue. I would like to understand why, if you don't mind answering a few
> > questions.
> >
> > - You mentioned having 32 cores, are they real cores or is it with
> > hyper-threading? Intel reports [1] only 8 real cores for the E5-2660, so
> > you should probably stick with at most 8 total threads per CPU
> > (listeners mostly in your case).
> You are right, this is with HT.
> CPU(s):              32
> Thread(s) per core:  2
> Core(s) per socket:  8
> Socket(s):           2
> NUMA node(s):        2
> Model name:          Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
> CPU MHz:             2194.783
> L1d cache:           32K
> L1i cache:           32K
> L2 cache:            256K
> L3 cache:            20480K
>
> Regarding the number of listeners, we ran the tests with different
> listener counts: 1, 2, 4, 8 and 12. One listener was the worst at
> 120 Kqps; the other configs were roughly the same, oscillating between
> 165 and 175 Kqps, with the 2- and 8-listener configs being the most
> stable.
>
> > - I'd be interested in the results of the dumpStats() and
> > cache:printStats() commands during your test, as well as a perf top,
> > ideally with a vanilla dnsdist and a dnsdist-concur.
> See attached file.
>
> > - The cache cleaning algorithm might be a bit aggressive by default; you
> > can tweak it with:
> > setCacheCleaningDelay(30)
> > setCacheCleaningPercentage(20)
> Done, but no impact (because of our test set).
>
> > - Exporting carbon data to our public metronome instance would be great
> > too, as it would immediately make a lot of metrics available to us. You
> > can do that with: carbonServer('37.252.122.50', '<yourname>', 30)
> Unfortunately we can't do that; this is on a closed net.
> We have our own carbon/graphite and check the stats there.
> We can send you any additional info you would like to get.
>
> > - Tuning the network buffers might also help:
> > net.core.rmem_max=33554432
> > net.core.wmem_max=33554432
> > net.core.rmem_default=16777216
> > net.core.wmem_default=16777216
> Already done, with very similar values.
> Also tried kernel.sched_migration_cost_ns, but with no visible impact.
>
> > - Would you consider upgrading your kernel? There have been a lot of
> > improvements since 3.10.0, and we noticed huge performance increases in
> > the past just by upgrading to a 4.x one.
> I would like to do that, but we are required to use Red Hat...
> We've done some tests on a small Core 2 with 4 cores, with kernel 4.9,
> and obtained almost the same results as on the "big one".
> This was a surprise.
> Trying to find a way (if security approves) to update the Red Hat kernel.
>
> > Oh, and if you didn't already, would you mind setting
> > setMaxUDPOutstanding() to 65535? Even at a 99% cache hit ratio, that
> > leaves quite a few requests going to the backend, so we'd better be
> > sure we don't mess up these.
> > The cache in dnsdist tries very hard not to degrade performance, so we
> > prefer skipping the cache and passing the query to a backend rather
> > than waiting for a cache lock, for example.
> Already done, also no difference.
> The queries we are sending are ~50 continuously repeating ones.
>
> Will keep testing, but I think this is all we can get for now.
> The optimum config now seems to be 3 processes with 6 or 8 listeners
> each (see the second sketch at the end of this mail). We will have to
> do some workarounds on the stats (aggregation rules in graphite?) and
> the service control scripts.
>
> Thanks again!
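P.S. In case it is useful to anyone hitting the same rule overhead: rules
added with addAction() are evaluated one after another, so ~60 separate
DropAction() rules can mean up to ~60 suffix matches per query. A single
rule backed by one suffix tree should be much cheaper. A minimal sketch,
assuming dnsdist's newSuffixMatchNode()/SuffixMatchNodeRule() Lua API and
with placeholder domain names:

  -- Sketch: collapse many per-domain drop rules into a single rule
  -- backed by one suffix-match tree. Domain names are placeholders.
  local dropped = newSuffixMatchNode()
  dropped:add(newDNSName('example.com'))
  dropped:add(newDNSName('example.org'))
  -- ... add the remaining domains the same way ...
  addAction(SuffixMatchNodeRule(dropped), DropAction())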
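P.P.S. For anyone reproducing the multi-listener setup: as we understand
the 1.3 docs, the options-table form of addLocal() lets each process bind
several UDP listener threads on the same address via SO_REUSEPORT. A rough
sketch, with a placeholder address and listener count:

  -- Sketch: each addLocal() call with reusePort=true adds another
  -- listener socket/thread on the same address ('192.0.2.1:53' is a
  -- placeholder). With SO_REUSEPORT, our 3 dnsdist processes can also
  -- share this same port, and the kernel spreads queries across them.
  for i = 1, 6 do
    addLocal('192.0.2.1:53', { reusePort = true })
  end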