Hi,

Thanks for the guidance!
We are testing multiple scenarios, with and without kernel tuning. We observed UDP packet errors on both backend servers (and not a single UDP error on the dnsdist LB server).

Tested with resperf at 15K QPS:

resperf -s 192.168.0.1 -R -d queryfile-example-10million-201202 -C 100 -c 300 -r 0 -m 15000 -q 200000

Backend 1: 192.168.1.1 (without kernel tuning):

netstat -su
IcmpMsg:
    InType3: 2229
    InType8: 6
    InType11: 194
    OutType0: 6
    OutType3: 762
Udp:
    1634847 packets received
    843 packets to unknown port received.
    193891 packet receive errors
    1859642 packets sent
    193891 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    InOctets: 580762744
    OutOctets: 237368675
    InNoECTPkts: 1995692
    InECT0Pkts: 27

Backend 2: 192.168.1.2 (with kernel tuning):

netstat -su
IcmpMsg:
    InType3: 19177
    InType8: 5802
    InType11: 2645
    OutType0: 5802
    OutType3: 5122
Udp:
    10798358 packets received
    6846 packets to unknown port received.
    4815377 packet receive errors
    11949871 packets sent
    4815377 receive buffer errors
    0 send buffer errors
UdpLite:
IpExt:
    InNoRoutes: 11
    InOctets: 3312682950
    OutOctets: 1741771756
    InNoECTPkts: 16355120
    InECT1Pkts: 72
    InECT0Pkts: 92
    InCEPkts: 4

Kernel tuning configured in /etc/rc.local:

ethtool -L eth0 combined 16
echo 52428800 > /proc/sys/net/netfilter/nf_conntrack_max
sysctl -w net.core.rmem_max=33554432
sysctl -w net.core.wmem_max=33554432
sysctl -w net.core.rmem_default=16777216
sysctl -w net.core.wmem_default=16777216
sysctl -w net.core.netdev_max_backlog=65536
sysctl -w net.core.somaxconn=1024
ulimit -n 16000

The network configuration and specs are the same on all three servers. Are we doing something wrong?

Regards,
Rais

-----Original Message-----
From: Klaus Darilion <klaus.daril...@nic.at>
Sent: Thursday, March 24, 2022 12:38 PM
To: Rais Ahmed <rais.ah...@tes.com.pk>; dnsdist@mailman.powerdns.com
Subject: AW: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'

Have you tested how many qps your backend is capable of handling?
First test your backend performance to find out how many qps a single backend can handle. I guess 500k qps might be difficult to achieve with BIND. If you need more performance, switch the backend to NSD or Knot.

regards
Klaus

> -----Original Message-----
> From: dnsdist <dnsdist-boun...@mailman.powerdns.com> On Behalf Of
> Rais Ahmed via dnsdist
> Sent: Wednesday, 23 March 2022 22:02
> To: dnsdist@mailman.powerdns.com
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
>
> Hi,
> Thanks for the reply!
>
> We have configured setMaxUDPOutstanding(65535) and we are still seeing
> the backends marked down. The logs frequently show the following:
>
> Timeout while waiting for the health check response from backend 192.168.1.1:53
> Timeout while waiting for the health check response from backend 192.168.1.2:53
>
> Please have a look at the dnsdist configuration below and help us find
> the misconfiguration (16 listeners and 8+8 backends added as per the
> vCPUs available (2 sockets x 8 cores)):
>
> controlSocket('127.0.0.1:5199')
> setKey("")
>
> ---- Listen addresses
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
> addLocal('192.168.0.1:53', { reusePort=true })
>
> ---- Back-end server
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=1})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=2})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=3})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=4})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=5})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=6})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=7})
> newServer({address='192.168.1.1', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=8})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=9})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=10})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=11})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=12})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=13})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=14})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=15})
> newServer({address='192.168.1.2', maxCheckFailures=3, checkInterval=5, weight=4, qps=40000, order=16})
>
> setMaxUDPOutstanding(65535)
>
> ---- Server Load Balancing Policy
> setServerPolicy(leastOutstanding)
>
> ---- Web-server
> webserver('192.168.0.1:8083')
> setWebserverConfig({acl='192.168.0.0/24', password='Secret'})
>
> ---- Customers Policy
> customerACLs={'192.168.1.0/24'}
> setACL(customerACLs)
>
> pc = newPacketCache(300000, {maxTTL=86400, minTTL=0, temporaryFailureTTL=60, staleTTL=60, dontAge=false})
> getPool(""):setCache(pc)
>
> setVerboseHealthChecks(true)
>
> Server specs are as below:
> Dnsdist LB server specs: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with 16 multiqueues.
> Backend bind9 server specs: 16 vCPUs, 16 GB RAM, Virtio NIC (10G) with 16 multiqueues.
>
> We are trying to handle 500K qps (we will increase the hardware specs
> if required) or, with the above specs, at least 100K qps.
>
>
> Regards,
> Rais
>
> -----Original Message-----
> From: dnsdist <dnsdist-boun...@mailman.powerdns.com> On Behalf Of
> dnsdist-requ...@mailman.powerdns.com
> Sent: Wednesday, March 23, 2022 5:00 PM
> To: dnsdist@mailman.powerdns.com
> Subject: dnsdist Digest, Vol 79, Issue 3
>
> Today's Topics:
>
>    1. dnsdist[29321]: Marking downstream IP:53 as 'down' (Rais Ahmed)
>    2. Re: dnsdist[29321]: Marking downstream IP:53 as 'down'
>       (Remi Gacogne)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 22 Mar 2022 23:00:25 +0000
> From: Rais Ahmed <rais.ah...@tes.com.pk>
> To: "dnsdist@mailman.powerdns.com" <dnsdist@mailman.powerdns.com>
> Subject: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as 'down'
> Message-ID:
>     <PAXPR08MB70737E4E1CCEFC4A7F61E1E6A0179@PAXPR08MB7073.eurprd08.prod.outlook.com>
> Content-Type: text/plain; charset="us-ascii"
>
> Hi,
>
> We have configured a dnsdist instance to handle around 500k QPS, but we
> are seeing the downstreams go down frequently once QPS rises above 25k.
> Below are the logs that we found related to the issue.
>
> dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
> dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <http://mailman.powerdns.com/pipermail/dnsdist/attachments/20220322/2befd6e2/attachment-0001.htm>
>
> ------------------------------
>
> Message: 2
> Date: Wed, 23 Mar 2022 10:32:22 +0100
> From: Remi Gacogne <remi.gaco...@powerdns.com>
> To: Rais Ahmed <rais.ah...@tes.com.pk>, "dnsdist@mailman.powerdns.com"
>     <dnsdist@mailman.powerdns.com>
> Subject: Re: [dnsdist] dnsdist[29321]: Marking downstream IP:53 as
>     'down'
> Message-ID: <5a95cbeb-7c82-9bc1-0b4c-8726f8144...@powerdns.com>
> Content-Type: text/plain; charset=UTF-8; format=flowed
>
> Hi,
>
> > We have configured a dnsdist instance to handle around 500k QPS, but we
> > are seeing the downstreams go down frequently once QPS rises above 25k.
> > Below are the logs that we found related to the issue.
> >
> > dnsdist[29321]: Marking downstream server1 IP:53 as 'down'
> >
> > dnsdist[29321]: Marking downstream server2 IP:53 as 'down'
>
> You might be able to get more information about why the health-checks
> are failing by adding setVerboseHealthChecks(true) to your configuration.
>
> It usually happens because the backend is overwhelmed and needs to be
> tuned to handle the load, but it might also be caused by a network
> issue, like a link reaching its maximum capacity, or by dnsdist itself
> being overwhelmed and needing tuning (like increasing the number of
> newServer() directives, see [1]).
>
> [1]: https://dnsdist.org/advanced/tuning.html#udp-and-incoming-dns-over-https
>
> Best regards,
> --
> Remi Gacogne
> PowerDNS.COM BV - https://www.powerdns.com/
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> dnsdist mailing list
> dnsdist@mailman.powerdns.com
> https://mailman.powerdns.com/mailman/listinfo/dnsdist
>
>
> ------------------------------
>
> End of dnsdist Digest, Vol 79, Issue 3
> **************************************
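
A note on reading the netstat -su figures at the top of this message: those counters are cumulative since boot, so the totals from the two backends cannot be compared directly. What matters is how many receive buffer errors accumulate during a single resperf run. The following is only a minimal sketch of that measurement. It assumes a Linux backend where /proc/net/snmp contains a "Udp:" line of column names followed by a "Udp:" line of values, and it treats InDatagrams plus InErrors as the total, ignoring packets to unknown ports. The helper names udp_counter and udp_drop_percent are made up for illustration and are not part of dnsdist, BIND, or resperf.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print one UDP counter (for example RcvbufErrors) from /proc/net/snmp-style
# text on stdin. The first "Udp:" line holds the column names, the second
# "Udp:" line holds the values.
udp_counter() {
    awk -v name="$1" '
        $1 == "Udp:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        $1 == "Udp:" && (name in col) { print $(col[name]); exit }
    '
}

# Print the percentage of datagrams that ended in a receive error, given the
# delta of InDatagrams ($1) and the delta of InErrors ($2) over a test window.
udp_drop_percent() {
    awk -v ok="$1" -v err="$2" 'BEGIN {
        total = ok + err
        if (total == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * err / total
    }'
}

# Intended use on a backend (left as comments because it needs a live system):
#   ok_before=$(udp_counter InDatagrams < /proc/net/snmp)
#   err_before=$(udp_counter InErrors < /proc/net/snmp)
#   ... run resperf against the dnsdist address ...
#   ok_after=$(udp_counter InDatagrams < /proc/net/snmp)
#   err_after=$(udp_counter InErrors < /proc/net/snmp)
#   udp_drop_percent $((ok_after - ok_before)) $((err_after - err_before))
```

Taking the deltas before and after one run keeps earlier tests and unrelated traffic out of the result, which a single netstat -su snapshot cannot do.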