BIND Process failed during logrotate

2023-03-22 Thread White, Peter
I had the named process fail this past weekend on two secondaries running BIND 
9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.13. It seems that logrotate.d is calling 
the following script at the time of the failure.

/var/named/data/named.run {
    missingok
    su named named
    create 0644 named named
    postrotate
        /usr/bin/systemctl reload named.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-chroot.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-sdb.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-sdb-chroot.service > /dev/null 2>&1 || true
        /usr/bin/systemctl reload named-pkcs11.service > /dev/null 2>&1 || true
    endscript
}

First of all, is this script part of the normal BIND distribution, or is it 
part of the RHEL 7 distribution? From what I can tell, it is called weekly.

Poring through the BIND logs for the cause of the failure, I came across this. 
Note the server.c:2498 error message and subsequent failure.
19-Mar-2023 03:46:01.908 received control channel command 'reload'
19-Mar-2023 03:46:01.908 loading configuration from '/etc/named.conf'
19-Mar-2023 03:46:01.909 reading built-in trust anchors from file '/etc/named.root.key'
19-Mar-2023 03:46:01.909 GeoIP Country (IPv4) (type 1) DB not available
19-Mar-2023 03:46:01.909 GeoIP Country (IPv6) (type 12) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv4) (type 2) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv4) (type 6) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv6) (type 30) DB not available
19-Mar-2023 03:46:01.909 GeoIP City (IPv6) (type 31) DB not available
19-Mar-2023 03:46:01.909 GeoIP Region (type 3) DB not available
19-Mar-2023 03:46:01.909 GeoIP Region (type 7) DB not available
19-Mar-2023 03:46:01.909 GeoIP ISP (type 4) DB not available
19-Mar-2023 03:46:01.909 GeoIP Org (type 5) DB not available
19-Mar-2023 03:46:01.909 GeoIP AS (type 9) DB not available
19-Mar-2023 03:46:01.909 GeoIP Domain (type 11) DB not available
19-Mar-2023 03:46:01.909 GeoIP NetSpeed (type 10) DB not available
19-Mar-2023 03:46:01.909 using default UDP/IPv4 port range: [1024, 65535]
19-Mar-2023 03:46:01.909 using default UDP/IPv6 port range: [1024, 65535]
19-Mar-2023 03:46:01.910 sizing zone task pool based on 2 zones
19-Mar-2023 03:46:01.911 ../../../bin/named/server.c:2498: fatal error:
19-Mar-2023 03:46:01.911 RUNTIME_CHECK(tresult == 0) failed
19-Mar-2023 03:46:01.911 exiting (due to fatal error in library)

Looking back a week earlier, when the script last ran, that server.c error was 
not there.
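
A minimal sketch for making that week-over-week comparison repeatable: it pulls the timestamped fatal-error lines out of a named log so that the current and the rotated file can be checked side by side. The marker strings are copied from the excerpt above; the function name `find_fatal_lines` is an illustrative assumption and not part of BIND.

```python
import re
from typing import Iterable, List, Tuple

# Timestamp layout as seen in the excerpt, e.g. "19-Mar-2023 03:46:01.911".
TIMESTAMP = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) (.*)$"
)

# Substrings taken from the failing reload in the excerpt.
MARKERS = ("fatal error", "RUNTIME_CHECK")


def find_fatal_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, message) pairs for lines that contain a marker."""
    hits = []
    for line in lines:
        match = TIMESTAMP.match(line.rstrip("\n"))
        if match and any(marker in match.group(2) for marker in MARKERS):
            hits.append((match.group(1), match.group(2)))
    return hits


def scan_file(path: str) -> List[Tuple[str, str]]:
    """Scan one log file; undecodable bytes are replaced rather than fatal."""
    with open(path, errors="replace") as handle:
        return find_fatal_lines(handle)
```

Calling `scan_file("/var/named/data/named.run")` and then the same function on the previous rotated file should show whether the RUNTIME_CHECK failure is new this week.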

Any thoughts on what could have caused this on two secondaries? The primary 
reloaded around the same time without incident.

Thanks for your assistance.
-- 
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from 
this list

ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.


bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Containerizing BIND with Kubernetes

2022-12-06 Thread White, Peter
Is there any good source of documentation on containerizing an authoritative 
BIND instance in a Kubernetes cluster?

The main part I’m trying to grasp is how to dynamically horizontally scale the 
cluster and keep the BIND notify process working between the containers.

Thanks,
Peter


Re: isc python module

2022-08-16 Thread White, Peter
I don’t mean to hijack the thread, but I think this is related. I also use the 
BIND python modules. In particular, I'm using them to update my catalog zones as 
described here: https://kb.isc.org/docs/aa-01401

This document has several references to BIND 9.18 without any mention of the 
BIND python module being deprecated. What am I missing? I hope this helps...
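
For context, a hedged sketch of the one piece of catalog zone maintenance that needs no BIND-specific module at all. As I understand the BIND catalog zone documentation, a member zone is listed as a PTR record under `zones.<catalog>`, and the conventional label is the SHA-1 hash of the member zone name in DNS wire format. The sketch uses only the Python standard library; the function names and the catalog name `catalog.example` are illustrative assumptions, and it does not reproduce the script from the KB article.

```python
import hashlib


def name_to_wire(zone_name: str) -> bytes:
    """Encode an ASCII domain name in uncompressed DNS wire format."""
    wire = b""
    for label in zone_name.rstrip(".").split("."):
        if not label:
            continue  # tolerate an empty name (the root)
        raw = label.encode("ascii")
        if len(raw) > 63:
            raise ValueError("DNS labels are limited to 63 octets")
        wire += bytes([len(raw)]) + raw
    return wire + b"\x00"  # terminating root label


def catalog_member_label(zone_name: str) -> str:
    """SHA-1 hex digest of the wire-format name, used as the record label."""
    return hashlib.sha1(name_to_wire(zone_name)).hexdigest()


def catalog_member_record(zone_name: str, catalog_zone: str) -> str:
    """Build the PTR line for a member zone (catalog name is hypothetical)."""
    zone = zone_name.rstrip(".")
    catalog = catalog_zone.rstrip(".")
    label = catalog_member_label(zone)
    return f"{label}.zones.{catalog}. IN PTR {zone}."
```

The resulting line could then be sent with nsupdate or any other dynamic update client, so this part does not depend on the isc module.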

On 8/16/22, 8:12 AM, "bind-users on behalf of Petr Špaček" wrote:

On 16. 08. 22 12:46, BÖSCH Christian wrote:
>  >> So my question is whether the isc python module no longer exists, and
>  >> whether there is an alternative?
>  >
>  > Please see release notes for 9.18.0, section Removed Features:
>  >
>  > https://bind9.readthedocs.io/en/v9_18_5/notes.html#removed-features
>  >
>  > Besides other things it links to copy of the library, (which is formally
>  > not supported outside of BIND 9.16, to be clear).
> 
>  Correcting myself:
>  The isc Python module is formally supported only as part of BIND 9.16
>  codebase.
> 
> Thanks Petr for your response.
> And if the module is no longer supported is there no replacement or any other possibility
> to deal with ansible or scripting the rndc commands?

I'm not sure what "deal with Ansible" exactly means :-) ISC does not and 
did not provide an Ansible module, I believe, so in that respect nothing 
has changed - use whatever third party software you were using before.

The rndc protocol is not evolving at the moment, so it should be 
unlikely we break the compatibility in near future.

Does it answer your question?

-- 
Petr Špaček


Re: Bind 9.11/RHEL7 Server Freezes FUTEX_WAKE_PRIVATE

2022-08-01 Thread White, Peter
Greg, what other awesome stuff do you have off the top of your head? This makes 
sense, as it’s running on EC2 at AWS (i.e., a poor source of randomness on VMs).

And thanks to Grant for the haveged suggestion.  Initial tests with haveged 
running seem to be positive.

I’ll report back here if the problem continues.

Thanks so much for your help!


From: Greg Choules 
Date: Monday, August 1, 2022 at 6:21 PM
To: White, Peter 
Cc: bind-users@lists.isc.org 
Subject: Re: Bind 9.11/RHEL7 Server Freezes FUTEX_WAKE_PRIVATE

Hi Peter.
Off the top of my head, could it be this?

random-device

The source of entropy to be used by the server. Entropy is primarily needed for 
DNSSEC operations, such as TKEY transactions and dynamic update of signed 
zones. This option specifies the device (or file) from which to read entropy. 
If this is a file, operations requiring entropy will fail when the file has 
been exhausted. If not specified, the default value is /dev/random (or 
equivalent) when present, and none otherwise. The random-device option takes 
effect during the initial configuration load at server startup time and is 
ignored on subsequent reloads.

BIND will need a good source of randomness for crypto operations.
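
A quick way to test this theory on a Linux host is to watch the kernel's entropy estimate while the stall is happening. The sketch below reads `/proc/sys/kernel/random/entropy_avail`, which is Linux-specific; the threshold of 200 bits is an arbitrary assumption chosen for illustration, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Union

# Linux-specific: the kernel's current estimate of available entropy, in bits.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path: Union[str, Path] = ENTROPY_PATH) -> int:
    """Return the integer stored in the entropy_avail file."""
    return int(Path(path).read_text().strip())


def entropy_is_low(available: int, threshold: int = 200) -> bool:
    """True when the estimate is below the (assumed) warning threshold."""
    return available < threshold
```

If `entropy_is_low(read_entropy())` is true whenever named appears frozen, a blocking read from /dev/random becomes a plausible explanation.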

Cheers, Greg

On Mon, 1 Aug 2022 at 23:08, White, Peter <pwh...@penguinrandomhouse.com> wrote:

I’m running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.9 (Extended Support 
Version) on RHEL 7 in a chroot jail.

As of late, running some rndc commands at times causes my server to lock 
up. It’s usually an “rndc addzone” that triggers the issue. I’ll also mention 
that I have recently started signing some domains with DNSSEC, so I suspect it 
may be somehow related.

Here is an example of a command that frequently triggers my issue, although it 
doesn’t trigger it every time.

rndc addzone '"example.com" in external {type master; file "dnssec/example.com"; key-directory "keys"; auto-dnssec maintain; inline-signing yes;};'

During these times, named will not respond to any rndc commands, logs nothing 
to the BIND logs (I’m running trace level 3), and will not answer queries. 
Everything seems frozen in time. After a period varying from a few seconds to 
many minutes, the server picks back up again and operates normally. The 
following are my observations to this point.

CPU and memory show as being fine.

top - 17:57:37 up 33 min,  3 users,  load average: 0.00, 0.01, 0.05
Tasks: 125 total,   2 running, 123 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.3 sy,  0.0 ni, 98.5 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 s
KiB Mem :  1842956 total,   439452 free,   665760 used,   737744 buff/cache
KiB Swap:  8384508 total,  8384508 free,        0 used.  1013652 avail Mem

Strace shows the following over and over again.

strace -p 1156 -f

[pid  1159] futex(0x7fc1c15a307c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16657, {tv_sec=1659390139, tv_nsec=25586}, 0x) = -1 ETIMEDOUT (Connection timed out)


Any pointers here would be greatly appreciated. I’m about at my wits’ end with 
this one, and rebuilding this server on a newer build of RHEL or recompiling 
BIND is not a journey that I would like to take at the moment.


Bind 9.11/RHEL7 Server Freezes FUTEX_WAKE_PRIVATE

2022-08-01 Thread White, Peter
I’m running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.9 (Extended Support 
Version) on RHEL 7 in a chroot jail.

As of late, running some rndc commands at times causes my server to lock 
up. It’s usually an “rndc addzone” that triggers the issue. I’ll also mention 
that I have recently started signing some domains with DNSSEC, so I suspect it 
may be somehow related.

Here is an example of a command that frequently triggers my issue, although it 
doesn’t trigger it every time.

rndc addzone '"example.com" in external {type master; file "dnssec/example.com"; key-directory "keys"; auto-dnssec maintain; inline-signing yes;};'

During these times, named will not respond to any rndc commands, logs nothing 
to the BIND logs (I’m running trace level 3), and will not answer queries. 
Everything seems frozen in time. After a period varying from a few seconds to 
many minutes, the server picks back up again and operates normally. The 
following are my observations to this point.

CPU and memory show as being fine.

top - 17:57:37 up 33 min,  3 users,  load average: 0.00, 0.01, 0.05
Tasks: 125 total,   2 running, 123 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.3 sy,  0.0 ni, 98.5 id,  0.0 wa,  0.0 hi,  0.0 si,  1.0 s
KiB Mem :  1842956 total,   439452 free,   665760 used,   737744 buff/cache
KiB Swap:  8384508 total,  8384508 free,        0 used.  1013652 avail Mem

Strace shows the following over and over again.

strace -p 1156 -f

[pid  1159] futex(0x7fc1c15a307c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16657, {tv_sec=1659390139, tv_nsec=25586}, 0x) = -1 ETIMEDOUT (Connection timed out)


Any pointers here would be greatly appreciated. I’m about at my wits’ end with 
this one, and rebuilding this server on a newer build of RHEL or recompiling 
BIND is not a journey that I would like to take at the moment.
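
One way to put numbers on the stalls described above is to time a cheap control command at intervals and record when it fails to return. This is a minimal sketch, not a monitoring tool: `time_command` is a hypothetical helper, the five-second timeout is an assumption, and it only presumes that `rndc status` normally returns quickly on a healthy server.

```python
import subprocess
import time
from typing import Optional, Sequence, Tuple


def time_command(
    cmd: Sequence[str], timeout: float = 5.0
) -> Tuple[Optional[int], float]:
    """Run cmd and return (exit code, elapsed seconds).

    The exit code is None when the command did not finish within timeout,
    which is the case of interest when named has stopped responding.
    """
    start = time.monotonic()
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
        return completed.returncode, time.monotonic() - start
    except subprocess.TimeoutExpired:
        return None, time.monotonic() - start
```

For example, calling `time_command(["rndc", "status"])` once a second from a loop and logging every result whose exit code is None would give start and end times for each freeze, which can then be lined up against the addzone calls.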