BIND Process failed during logrotate
I had the named process fail this past weekend on two secondaries running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.13. It seems that logrotate.d was calling the following script at the time of the failure.

    /var/named/data/named.run {
        missingok
        su named named
        create 0644 named named
        postrotate
            /usr/bin/systemctl reload named.service > /dev/null 2>&1 || true
            /usr/bin/systemctl reload named-chroot.service > /dev/null 2>&1 || true
            /usr/bin/systemctl reload named-sdb.service > /dev/null 2>&1 || true
            /usr/bin/systemctl reload named-sdb-chroot.service > /dev/null 2>&1 || true
            /usr/bin/systemctl reload named-pkcs11.service > /dev/null 2>&1 || true
        endscript
    }

First of all, is this script part of the normal BIND distribution, or is it part of the RHEL 7 distribution? From what I can tell, it is called weekly.

Poring through the BIND logs for the cause of the failure, I came across this. Note the server.c:2498 error message and subsequent failure.

    19-Mar-2023 03:46:01.908 received control channel command 'reload'
    19-Mar-2023 03:46:01.908 loading configuration from '/etc/named.conf'
    19-Mar-2023 03:46:01.909 reading built-in trust anchors from file '/etc/named.root.key'
    19-Mar-2023 03:46:01.909 GeoIP Country (IPv4) (type 1) DB not available
    19-Mar-2023 03:46:01.909 GeoIP Country (IPv6) (type 12) DB not available
    19-Mar-2023 03:46:01.909 GeoIP City (IPv4) (type 2) DB not available
    19-Mar-2023 03:46:01.909 GeoIP City (IPv4) (type 6) DB not available
    19-Mar-2023 03:46:01.909 GeoIP City (IPv6) (type 30) DB not available
    19-Mar-2023 03:46:01.909 GeoIP City (IPv6) (type 31) DB not available
    19-Mar-2023 03:46:01.909 GeoIP Region (type 3) DB not available
    19-Mar-2023 03:46:01.909 GeoIP Region (type 7) DB not available
    19-Mar-2023 03:46:01.909 GeoIP ISP (type 4) DB not available
    19-Mar-2023 03:46:01.909 GeoIP Org (type 5) DB not available
    19-Mar-2023 03:46:01.909 GeoIP AS (type 9) DB not available
    19-Mar-2023 03:46:01.909 GeoIP Domain (type 11) DB not available
    19-Mar-2023 03:46:01.909 GeoIP NetSpeed (type 10) DB not available
    19-Mar-2023 03:46:01.909 using default UDP/IPv4 port range: [1024, 65535]
    19-Mar-2023 03:46:01.909 using default UDP/IPv6 port range: [1024, 65535]
    19-Mar-2023 03:46:01.910 sizing zone task pool based on 2 zones
    19-Mar-2023 03:46:01.911 ../../../bin/named/server.c:2498: fatal error:
    19-Mar-2023 03:46:01.911 RUNTIME_CHECK(tresult == 0) failed
    19-Mar-2023 03:46:01.911 exiting (due to fatal error in library)

Looking back a week earlier, when the script last ran, that server.c error was not there.

Any thoughts on what could have caused this on two secondaries? The primary reloaded around the same time without incident.

Thanks for your assistance.

--
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

ISC funds the development of this software with paid support subscriptions. Contact us at https://www.isc.org/contact/ for more information.

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
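To compare reloads across weeks or across servers, it may help to scan named.run for a fatal error that follows a control-channel reload. The following is only a minimal sketch, not part of BIND or RHEL. It assumes the timestamp format seen in the log excerpt above and an English locale for month names, and the function names parse_line and find_failed_reloads are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
#   19-Mar-2023 03:46:01.908 received control channel command 'reload'
LINE_RE = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3}) (.*)$"
)
RELOAD_MARKER = "received control channel command 'reload'"


def parse_line(line):
    """Split one log line into (timestamp, message); None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S.%f")
    return stamp, match.group(2)


def find_failed_reloads(lines):
    """Return a list of (reload_time, messages) for reloads followed by a fatal error.

    messages holds the fatal error line and every parsed line after it,
    up to the next reload command.
    """
    failures = []
    reload_at = None   # timestamp of the most recent reload command
    fatal = None       # message list for the current failure, if any
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        stamp, message = parsed
        if RELOAD_MARKER in message:
            reload_at, fatal = stamp, None
        elif reload_at is not None and fatal is None and "fatal error" in message:
            fatal = [message]
            failures.append((reload_at, fatal))
        elif fatal is not None:
            fatal.append(message)
    return failures
```

Called as find_failed_reloads(open("/var/named/data/named.run")), it would list each reload that ended in a fatal error, which makes it easier to check whether the previous week's reload on the same host was clean.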
Containerizing BIND with Kubernetes
Is there any good source of documentation on containerizing an authoritative BIND instance in a Kubernetes cluster? The main part I’m trying to grasp is how to dynamically horizontally scale the cluster and keep the BIND notify process working between the containers.

Thanks,
Peter
Re: isc python module
I don’t mean to hijack the thread, but I think this is related. I also use the BIND python modules. In particular, I'm using them to update my catalog zones as described here:

https://kb.isc.org/docs/aa-01401

This document has several references to BIND 9.18 without any mention of the BIND python module being deprecated. What am I missing?

I hope this helps...

On 8/16/22, 8:12 AM, "bind-users on behalf of Petr Špaček" wrote:

    On 16. 08. 22 12:46, BÖSCH Christian wrote:
    > >> So my question is whether the isc python module no longer exists, and
    > >> whether there is an alternative?
    > >
    > > Please see release notes for 9.18.0, section Removed Features:
    > >
    > > https://bind9.readthedocs.io/en/v9_18_5/notes.html#removed-features
    > >
    > > Besides other things it links to copy of the library, (which is formally
    > > not supported outside of BIND 9.16, to be clear).
    >
    > Correcting myself:
    > The isc Python module is formally supported only as part of BIND 9.16
    > codebase.
    >
    > Thanks Petr for your response.
    > And if the module is no longer supported is there no replacement or any other possibility
    > to deal with ansible or scripting the rndc commands?

    I'm not sure what "deal with Ansible" exactly mean :-)

    ISC does not and did not provide an Ansible module, I believe, so in
    that respect nothing has changed - use whatever third party software you
    were using before.

    The rndc protocol is not evolving at the moment, so it should be
    unlikely we break the compatibility in near future.

    Does it answer your question?

    --
    Petr Špaček
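For scripting rndc commands without the isc Python module, one option is to drive the rndc command-line tool itself. This is an assumption on my part and not something recommended in the thread; the sketch below is minimal and the names build_rndc_argv and run_rndc are hypothetical. It uses only the standard -s (server) and -k (key file) options of rndc, and passes arguments as a list so that zone configuration strings containing quotes and braces need no shell escaping.

```python
import subprocess


def build_rndc_argv(command, *args, server=None, key_file=None, rndc_path="rndc"):
    """Build the argument list for one rndc invocation.

    server and key_file map to rndc's -s and -k options; rndc_path is the
    executable name or path (assumed to be on PATH by default).
    """
    argv = [rndc_path]
    if server is not None:
        argv += ["-s", server]
    if key_file is not None:
        argv += ["-k", key_file]
    argv.append(command)
    argv.extend(args)
    return argv


def run_rndc(command, *args, timeout=30, **options):
    """Run rndc and return the CompletedProcess (stdout, stderr, returncode).

    A timeout is set so that a script does not hang forever if named
    stops answering on the control channel.
    """
    argv = build_rndc_argv(command, *args, **options)
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
```

A call such as run_rndc("status", server="192.0.2.1") would then return the exit code and output for the caller to inspect; the address here is a documentation placeholder.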
Re: Bind 9.11/RHEL7 Server Freezes FUTEX_WAKE_PRIVATE
Greg,

What other awesome stuff do you have on the top of your head? This makes sense, as it’s running on EC2 at AWS (i.e. a poor source of randomness on VMs). And thanks to Grant for the haveged suggestion. Initial tests with haveged running seem to be positive. I’ll report back here if the problem continues.

Thanks so much for your help!

From: Greg Choules
Date: Monday, August 1, 2022 at 6:21 PM
To: White, Peter
Cc: bind-users@lists.isc.org
Subject: Re: Bind 9.11/RHEL7 Server Freezes FUTEX_WAKE_PRIVATE

Hi Peter.
Off the top of my head, could it be this?

    random-device
        The source of entropy to be used by the server. Entropy is
        primarily needed for DNSSEC operations, such as TKEY transactions
        and dynamic update of signed zones. This option specifies the
        device (or file) from which to read entropy. If this is a file,
        operations requiring entropy will fail when the file has been
        exhausted. If not specified, the default value is /dev/random (or
        equivalent) when present, and none otherwise. The random-device
        option takes effect during the initial configuration load at
        server startup time and is ignored on subsequent reloads.

BIND will need a good source of randomness for crypto operations.

Cheers, Greg

On Mon, 1 Aug 2022 at 23:08, White, Peter <pwh...@penguinrandomhouse.com> wrote:

I’m running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.9 (Extended Support Version) on RHEL 7 in a chroot jail. As of late, running some rndc commands at times causes my server to lock up. It’s usually an “rndc addzone” that triggers the issue. I’ll also mention that I have recently started signing some domains with DNSSEC, so I suspect it may be somehow related.

Here is an example of a command that frequently triggers my issue, although it doesn’t trigger it every time.

    rndc addzone '"example.com" in external {type master; file "dnssec/example.com";key-directory "keys"; auto-dnssec maintain; inline-signing yes;};'

During these times, named will not respond to any rndc commands, nothing is logged to the bind logs (I’m running trace level 3), and it will not answer queries. Everything seems just frozen in time. After a period of time, varying from a few seconds to many minutes, the server picks back up again and operates normally.

The following are my observations to this point.

CPU and memory show as being fine.

    top - 17:57:37 up 33 min, 3 users, load average: 0.00, 0.01, 0.05
    Tasks: 125 total, 2 running, 123 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 0.2 us, 0.3 sy, 0.0 ni, 98.5 id, 0.0 wa, 0.0 hi, 0.0 si, 1.0 s
    KiB Mem : 1842956 total, 439452 free, 665760 used, 737744 buff/cache
    KiB Swap: 8384508 total, 8384508 free, 0 used. 1013652 avail Mem

Strace shows the following over and over again.

    strace -p 1156 -f
    [pid 1159] futex(0x7fc1c15a307c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16657, {tv_sec=1659390139, tv_nsec=25586}, 0x) = -1 ETIMEDOUT (Connection timed out)

Any pointers here would be greatly appreciated. I’m about at my wits’ end with this one, and rebuilding this server on a newer build of RHEL or recompiling BIND is not a journey that I would like to take at the moment.
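The entropy explanation discussed above can be checked on a Linux host by reading the kernel's entropy estimate around the time of a freeze. The following is a minimal sketch under stated assumptions: it reads /proc/sys/kernel/random/entropy_avail, which exists on Linux, while the threshold of 200 bits and the function names are illustrative choices of mine and not a documented BIND or kernel limit.

```python
from pathlib import Path

# Linux exposes the kernel's estimate of available entropy (in bits) here.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")

# Assumption: an arbitrary illustrative threshold, not an official value.
LOW_WATERMARK = 200


def parse_entropy(text):
    """Convert the file's contents (a decimal number plus newline) to an int."""
    return int(text.strip())


def read_entropy(path=ENTROPY_AVAIL):
    """Return the current entropy estimate in bits."""
    return parse_entropy(Path(path).read_text())


def is_low(bits, threshold=LOW_WATERMARK):
    """True when the estimate is below the chosen threshold."""
    return bits < threshold
```

Sampling read_entropy() just before and during an rndc addzone, with and without haveged running, would give some evidence for or against a drained pool causing reads from /dev/random to block.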
Bind 9.11/RHEL7 Server Freezes FUTEX_WAKE_PRIVATE
I’m running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.9 (Extended Support Version) on RHEL 7 in a chroot jail. As of late, running some rndc commands at times causes my server to lock up. It’s usually an “rndc addzone” that triggers the issue. I’ll also mention that I have recently started signing some domains with DNSSEC, so I suspect it may be somehow related.

Here is an example of a command that frequently triggers my issue, although it doesn’t trigger it every time.

    rndc addzone '"example.com" in external {type master; file "dnssec/example.com";key-directory "keys"; auto-dnssec maintain; inline-signing yes;};'

During these times, named will not respond to any rndc commands, nothing is logged to the bind logs (I’m running trace level 3), and it will not answer queries. Everything seems just frozen in time. After a period of time, varying from a few seconds to many minutes, the server picks back up again and operates normally.

The following are my observations to this point.

CPU and memory show as being fine.

    top - 17:57:37 up 33 min, 3 users, load average: 0.00, 0.01, 0.05
    Tasks: 125 total, 2 running, 123 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 0.2 us, 0.3 sy, 0.0 ni, 98.5 id, 0.0 wa, 0.0 hi, 0.0 si, 1.0 s
    KiB Mem : 1842956 total, 439452 free, 665760 used, 737744 buff/cache
    KiB Swap: 8384508 total, 8384508 free, 0 used. 1013652 avail Mem

Strace shows the following over and over again.

    strace -p 1156 -f
    [pid 1159] futex(0x7fc1c15a307c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16657, {tv_sec=1659390139, tv_nsec=25586}, 0x) = -1 ETIMEDOUT (Connection timed out)

Any pointers here would be greatly appreciated. I’m about at my wits’ end with this one, and rebuilding this server on a newer build of RHEL or recompiling BIND is not a journey that I would like to take at the moment.