Re: [dnsdist] dnsdist 1.8.0 thread spinning

2023-07-17 Thread Dustin Marquess via dnsdist
This looks like it might indeed be it! I'll apply it to our internal 1.8.0 
packages and give it a shot. Thanks!

-Dustin
On Jul 15, 2023 at 2:42 AM -0500, Otto Moerbeek wrote:
> On Fri, Jul 14, 2023 at 03:06:12PM -0500, Dustin Marquess via dnsdist wrote:
>
> > So far we've had instances with dnsdist 1.8.0 having a thread in a tight 
> > loop. OS versions seem to vary widely, so I don't believe it's a glibc bug.
> >
> > Config on both is the same plain config:
> >
> > setLocal("127.0.0.1:53", {reusePort=true})
> > addLocal("127.0.0.1:53", {reusePort=true})
> > addLocal("127.0.0.1:53", {reusePort=true})
> > addLocal("127.0.0.1:53", {reusePort=true})
> > addACL('10.0.0.0/8')
> > newServer({address="10.112.104.116", checkType="A", checkClass=DNSClass.IN, 
> > checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> > newServer({address="10.112.106.177", checkType="A", checkClass=DNSClass.IN, 
> > checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> > newServer({address="10.9.41.68", checkType="A", checkClass=DNSClass.IN, 
> > checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> > setServerPolicy(firstAvailable)
> >
> > -- Tuning
> > setRingBuffersSize(100, 100)
> > setMaxTCPClientThreads(20)
> >
> > -- Caching
> > -- We should make these tunables configurable
> > pc = newPacketCache(10, {maxTTL=86400, minTTL=0, 
> > temporaryFailureTTL=60, staleTTL=60, dontAge=false})
> > getPool(""):setCache(pc)
> >
> > -- Don't try and hit the internet
> > setSecurityPollSuffix("")
> >
> > In each case, a strace shows a bad recvfrom() call in a tight loop:
> >
> > [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> > [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> > [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> > [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> >
> > Obviously -1 is a bad fd! Restarting dnsdist seems to resolve it. The only 
> > idea I can come up with is that when dnsdist first starts, it's unable to 
> > contact the upstream DNS servers and that somehow causes the issue. When we 
> > restart it, it IS able to contact them, and so works fine.
> >
> > Any ideas?
> >
> > Thanks!
> > -Dustin
>
> This is likely https://github.com/PowerDNS/pdns/pull/12726
>
> ATM this is not marked for backporting to 1.8.x. Don't know if that is
> an omission.
>
> -Otto
___
dnsdist mailing list
dnsdist@mailman.powerdns.com
https://mailman.powerdns.com/mailman/listinfo/dnsdist


Re: [dnsdist] dnsdist 1.8.0 thread spinning

2023-07-17 Thread Remi Gacogne via dnsdist

On 15/07/2023 09:42, Otto Moerbeek via dnsdist wrote:

> This is likely https://github.com/PowerDNS/pdns/pull/12726
>
> ATM this is not marked for backporting to 1.8.x. Don't know if that is
> an omission.


It was. I added the 'backport to dnsdist-1.8.x' flag in the meantime.
Thanks!


--
Remi Gacogne
PowerDNS.COM BV - https://www.powerdns.com/





Re: [dnsdist] dnsdist 1.8.0 thread spinning

2023-07-15 Thread Otto Moerbeek via dnsdist
On Fri, Jul 14, 2023 at 03:06:12PM -0500, Dustin Marquess via dnsdist wrote:

> So far we've had instances with dnsdist 1.8.0 having a thread in a tight 
> loop. OS versions seem to vary widely, so I don't believe it's a glibc bug.
> 
> Config on both is the same plain config:
> 
> setLocal("127.0.0.1:53", {reusePort=true})
> addLocal("127.0.0.1:53", {reusePort=true})
> addLocal("127.0.0.1:53", {reusePort=true})
> addLocal("127.0.0.1:53", {reusePort=true})
> addACL('10.0.0.0/8')
> newServer({address="10.112.104.116", checkType="A", checkClass=DNSClass.IN, 
> checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> newServer({address="10.112.106.177", checkType="A", checkClass=DNSClass.IN, 
> checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> newServer({address="10.9.41.68", checkType="A", checkClass=DNSClass.IN, 
> checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
> setServerPolicy(firstAvailable)
> 
> -- Tuning
> setRingBuffersSize(100, 100)
> setMaxTCPClientThreads(20)
> 
> -- Caching
> -- We should make these tunables configurable
> pc = newPacketCache(10, {maxTTL=86400, minTTL=0, temporaryFailureTTL=60, 
> staleTTL=60, dontAge=false})
> getPool(""):setCache(pc)
> 
> -- Don't try and hit the internet
> setSecurityPollSuffix("")
> 
> In each case, a strace shows a bad recvfrom() call in a tight loop:
> 
> [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> [pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
> 
> Obviously -1 is a bad fd! Restarting dnsdist seems to resolve it. The only 
> idea I can come up with is that when dnsdist first starts, it's unable to 
> contact the upstream DNS servers and that somehow causes the issue. When we 
> restart it, it IS able to contact them, and so works fine.
> 
> Any ideas?
> 
> Thanks!
> -Dustin

This is likely https://github.com/PowerDNS/pdns/pull/12726

ATM this is not marked for backporting to 1.8.x. Don't know if that is
an omission.

-Otto


[dnsdist] dnsdist 1.8.0 thread spinning

2023-07-14 Thread Dustin Marquess via dnsdist
So far we've had instances with dnsdist 1.8.0 having a thread in a tight loop. 
OS versions seem to vary widely, so I don't believe it's a glibc bug.

Config on both is the same plain config:

setLocal("127.0.0.1:53", {reusePort=true})
addLocal("127.0.0.1:53", {reusePort=true})
addLocal("127.0.0.1:53", {reusePort=true})
addLocal("127.0.0.1:53", {reusePort=true})
addACL('10.0.0.0/8')
newServer({address="10.112.104.116", checkType="A", checkClass=DNSClass.IN, 
checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
newServer({address="10.112.106.177", checkType="A", checkClass=DNSClass.IN, 
checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
newServer({address="10.9.41.68", checkType="A", checkClass=DNSClass.IN, 
checkName="hc.xxx.local", mustResolve=true, checkInterval=30})
setServerPolicy(firstAvailable)

-- Tuning
setRingBuffersSize(100, 100)
setMaxTCPClientThreads(20)

-- Caching
-- We should make these tunables configurable
pc = newPacketCache(10, {maxTTL=86400, minTTL=0, temporaryFailureTTL=60, 
staleTTL=60, dontAge=false})
getPool(""):setCache(pc)

-- Don't try and hit the internet
setSecurityPollSuffix("")

In each case, a strace shows a bad recvfrom() call in a tight loop:

[pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
[pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
[pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)
[pid  2990] recvfrom(-1, 0x7f3d9c0008c0, 4368, 0, NULL, NULL) = -1 EBADF (Bad file descriptor)

Obviously -1 is a bad fd! Restarting dnsdist seems to resolve it. The only idea 
I can come up with is that when dnsdist first starts, it's unable to contact 
the upstream DNS servers and that somehow causes the issue. When we restart it, 
it IS able to contact them, and so works fine.
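
For readers who want to reproduce the symptom in isolation, here is a minimal sketch in Python. It is not dnsdist code and makes no claim about where dnsdist's loop lives. It assumes a closed UDP socket is a fair stand-in for the -1 descriptor in the strace output, and the helper names try_recv and receive_loop are made up for illustration. It shows that recvfrom() on an invalid descriptor fails immediately with EBADF instead of blocking, so a loop that just retries will spin.

```python
import errno
import socket


def try_recv(sock, size=4368):
    """Attempt one recvfrom(); return the errno on failure, None on success."""
    try:
        sock.recvfrom(size)
    except OSError as exc:
        return exc.errno
    return None


def receive_loop(sock, max_iterations=1000):
    """Return how many iterations ran before giving up on a dead descriptor.

    A loop that retried on every error would burn all max_iterations
    (or spin forever without the cap); treating EBADF as fatal ends it
    after the first failed call.
    """
    iterations = 0
    while iterations < max_iterations:
        iterations += 1
        if try_recv(sock) == errno.EBADF:
            break  # the descriptor is gone; retrying cannot succeed
    return iterations


# Assumption for illustration: a closed UDP socket stands in for the bad fd.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.close()  # after close(), sock.fileno() reports -1
```

Calling try_recv(sock) here returns errno.EBADF straight away, and receive_loop(sock) stops after a single iteration because it treats that error as fatal. Only call receive_loop() on the closed socket: on an open, idle socket recvfrom() would simply block.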

Any ideas?

Thanks!
-Dustin