Re: BIND 9.9.1-P4 is now available
Hello Jeremy,

Thank you for your reply. I plan to send more information to ISC when I have it.

FYI: it looks like my response didn't make it out yesterday, so here is another attempt. Please see my responses inline below:

----- Original Message -----
> From: Jeremy C. Reed
> To: Fr34k
> Cc: Bindlist
> Sent: Thursday, October 25, 2012 3:29 PM
> Subject: Re: BIND 9.9.1-P4 is now available
>
>> Let me define what "hung" means in our experience: We find that named is
>> running but will not respond to queries, "rndc status" will respond with
>> output but that output shows that named is not processing any queries (see
>> below), other rndc commands appear to work as well (e.g., "rndc dumpdb").
>
> Does it work if you restart named?

Yes. That is, when we restart named/9.9.1-P3 it works as well as it has since it was installed on 10/3/2012.

> If not, can you confirm it is listening on your intended interfaces
> (including 127.0.0.1) even if not working?
>
>> $ time host www.google.com 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> Can you confirm that you can query for that without? (Such as dig
> @216.239.34.10 www.google.com or dig @8.8.8.8 www.google.com)

Yes, and I just didn't provide any of those examples (sorry). That is, I can say that any query (localhost or 3rd party hostnames) results in the same outcome of "connection timed out; no servers could be reached".

>> $ time host localhost 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> Do you have a localhost zone defined? (Sometimes the messages from host
> like the one above are misleading and even the named may be working
> correctly but it is slow.)

While we do have a localhost zone defined, all of our spot checks, for both local and off-network queries, would fail. Once we restart 9.9.1-P3, everything works again.

> Jeremy C. Reed
> ISC

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Re: BIND 9.9.1-P4 is now available
Hello Jeremy,

Thank you for your reply.

>> Let me define what "hung" means in our experience: We find that named is
>> running but will not respond to queries, "rndc status" will respond with
>> output but that output shows that named is not processing any queries (see
>> below), other rndc commands appear to work as well (e.g., "rndc dumpdb").
>
> Does it work if you restart named?

Yes. That is, everything is up and running again after we restart named. 9.9.1-P3 has been running on several servers since 10/3 without any known issues... until today.

> If not, can you confirm it is listening on your intended interfaces
> (including 127.0.0.1) even if not working?
>
>> $ time host www.google.com 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> Can you confirm that you can query for that without? (Such as dig
> @216.239.34.10 www.google.com or dig @8.8.8.8 www.google.com)
>
>> $ time host localhost 127.0.0.1
>> ;; connection timed out; no servers could be reached
>
> Do you have a localhost zone defined? (Sometimes the messages from host
> like the one above are misleading and even the named may be working
> correctly but it is slow.)

Yes, we do have a localhost zone defined. However, queries for 3rd party hostnames (e.g., www.google.com) were failing as well.

> Jeremy C. Reed
> ISC
Re: BIND 9.9.1-P4 is now available
> Let me define what "hung" means in our experience: We find that named is
> running but will not respond to queries, "rndc status" will respond with
> output but that output shows that named is not processing any queries (see
> below), other rndc commands appear to work as well (e.g., "rndc dumpdb").

Does it work if you restart named?

If not, can you confirm it is listening on your intended interfaces
(including 127.0.0.1) even if not working?

> $ time host www.google.com 127.0.0.1
> ;; connection timed out; no servers could be reached

Can you confirm that you can query for that without? (Such as dig
@216.239.34.10 www.google.com or dig @8.8.8.8 www.google.com)

> $ time host localhost 127.0.0.1
> ;; connection timed out; no servers could be reached

Do you have a localhost zone defined? (Sometimes the messages from host
like the one above are misleading and even the named may be working
correctly but it is slow.)

Jeremy C. Reed
ISC
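The checks suggested above amount to running the same lookup against the local named and against an outside server, then comparing the results. A minimal sh sketch for telling a timeout apart from any other reply; the function name query_state is made up for illustration, and it only matches the timeout text shown in the quoted host output:

```shell
#!/bin/sh
# Sketch only: query_state is a hypothetical helper, not part of BIND.
# $1 = captured output of one lookup, e.g. from "host NAME SERVER"
#      or "dig @SERVER NAME".
query_state() {
    case "$1" in
        # The message host prints when no server answered in time.
        *"connection timed out"*) echo "timeout" ;;
        # Anything else counts as some kind of reply (even an error rcode).
        *) echo "reply" ;;
    esac
}
```

For example, `query_state "$(host www.google.com 127.0.0.1 2>&1)"` checks the local server and `query_state "$(dig @8.8.8.8 www.google.com 2>&1)"` checks an outside one. A timeout only on the first would point at the local named rather than at the network path.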
Re: BIND 9.9.1-P4 is now available
Hello Again,

I could have made my question a bit clearer as I try to understand the details behind what P4 addresses. Perhaps I am having an internal battle between logic and interpretation around "or". Let me explain.

I'm wondering if a named process affected by CVE-2012-5166 has symptoms of both (1) "not respond to queries" and (2) "not respond to control commands" at the same time, all the time. If that is the case, then P4 will not address my issue, as I am only seeing (1), and so there may be another bug affecting BIND stability which I would like to report.

Thank you.

> From: Fr34k
> To: Bindlist
> Sent: Thursday, October 25, 2012 9:51 AM
> Subject: Re: BIND 9.9.1-P4 is now available
>
> Hello,
>
> We are finding several of our recursive BIND 9.9.1-P3 servers (on
> Solaris 10 OS) hung and I want to be able to qualify the symptoms in
> order to convince others that P4 (or 9.9.2?) will (or will not)
> address this.
>
> Let me define what "hung" means in our experience: We find that named
> is running but will not respond to queries, "rndc status" will respond
> with output but that output shows that named is not processing any
> queries (see below), other rndc commands appear to work as well
> (e.g., "rndc dumpdb").
>
> From what I understand, P4 offers this known bug fix:
>
> * A deliberately constructed combination of records could cause named
>   to hang while populating the additional section of a response.
>   [RT #31090] -- CVE-2012-5166: Specially crafted DNS data can cause
>   a lockup in named
>
> Additional details are mentioned in
> https://kb.isc.org/article/AA-00801/74/CVE-2012-5166%3A-Specially-crafted-DNS-data-can-cause-a-lockup-in-named.html:
> "A nameserver that has become locked-up due to the problem reported in
> this advisory will not respond to queries or control commands."
>
> So, our hang issue qualifies for the "...will not respond to queries";
> however, it seems that our issue does *not* qualify for the "... will
> not respond to... control commands" piece if the responses from "rndc"
> are considered control commands.
>
> Thoughts?
>
> Thank you.
>
> $ rndc status
> version: 9.9.1-P3 (version.bind/txt/ch disabled)
> CPUs found: 2
> worker threads: 2
> UDP listeners per interface: 2
> number of zones: 36
> debug level: 0
> xfers running: 0
> xfers deferred: 0
> soa queries in progress: 0
> query logging is OFF
> recursive clients: 0/3900/4000
> tcp clients: 0/100
> server is up and running
>
> $ time host www.google.com 127.0.0.1
> ;; connection timed out; no servers could be reached
>
> real 0m10.035s
> user 0m0.017s
> sys 0m0.017s
> $ time host localhost 127.0.0.1
> ;; connection timed out; no servers could be reached
>
> real 0m10.034s
> user 0m0.017s
> sys 0m0.017s
>
> $ truss -p 17657
> /4: lwp_park(0xFE9AFD48, 0) (sleeping...)
> /3: lwp_park(0x, 0) (sleeping...)
> /1: sigtimedwait(0xFFBFFBE8, 0xFFBFFB68, 0x) (sleeping...)
> /2: lwp_park(0x, 0) (sleeping...)
> /5: ioctl(8, DP_POLL, 0xFE98FF80) (sleeping...)
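The question above comes down to which of two symptoms are present at the same time. A minimal sh sketch that just names the four combinations; hang_pattern and its yes/no arguments are invented for this example. Whether the advisory's wording covers only the first case is exactly the open question, so the script does not try to answer it:

```shell
#!/bin/sh
# Sketch only: hang_pattern is a hypothetical helper, not part of BIND.
# $1 = "yes" if named answered a test query, "no" otherwise
# $2 = "yes" if "rndc status" returned output, "no" otherwise
hang_pattern() {
    case "$1/$2" in
        no/no)   echo "queries dead, control dead" ;;
        no/yes)  echo "queries dead, control alive" ;;
        yes/no)  echo "queries alive, control dead" ;;
        yes/yes) echo "responsive" ;;
        *)       echo "unknown" ;;
    esac
}
```

The servers described in this thread would come out as `queries dead, control alive`, since host times out while rndc status and rndc dumpdb still respond.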
Re: BIND 9.9.1-P4 is now available
Hello,

We are finding several of our recursive BIND 9.9.1-P3 servers (on Solaris 10 OS) hung and I want to be able to qualify the symptoms in order to convince others that P4 (or 9.9.2?) will (or will not) address this.

Let me define what "hung" means in our experience: We find that named is running but will not respond to queries, "rndc status" will respond with output but that output shows that named is not processing any queries (see below), other rndc commands appear to work as well (e.g., "rndc dumpdb").

From what I understand, P4 offers this known bug fix:

* A deliberately constructed combination of records could cause named
  to hang while populating the additional section of a response.
  [RT #31090] -- CVE-2012-5166: Specially crafted DNS data can cause
  a lockup in named

Additional details are mentioned in https://kb.isc.org/article/AA-00801/74/CVE-2012-5166%3A-Specially-crafted-DNS-data-can-cause-a-lockup-in-named.html: "A nameserver that has become locked-up due to the problem reported in this advisory will not respond to queries or control commands."

So, our hang issue qualifies for the "...will not respond to queries"; however, it seems that our issue does *not* qualify for the "... will not respond to... control commands" piece if the responses from "rndc" are considered control commands.

Thoughts?

Thank you.

$ rndc status
version: 9.9.1-P3 (version.bind/txt/ch disabled)
CPUs found: 2
worker threads: 2
UDP listeners per interface: 2
number of zones: 36
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/3900/4000
tcp clients: 0/100
server is up and running

$ time host www.google.com 127.0.0.1
;; connection timed out; no servers could be reached

real 0m10.035s
user 0m0.017s
sys 0m0.017s
$ time host localhost 127.0.0.1
;; connection timed out; no servers could be reached

real 0m10.034s
user 0m0.017s
sys 0m0.017s

$ truss -p 17657
/4: lwp_park(0xFE9AFD48, 0) (sleeping...)
/3: lwp_park(0x, 0) (sleeping...)
/1: sigtimedwait(0xFFBFFBE8, 0xFFBFFB68, 0x) (sleeping...)
/2: lwp_park(0x, 0) (sleeping...)
/5: ioctl(8, DP_POLL, 0xFE98FF80) (sleeping...)
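The "not processing any queries" observation above rests on the "recursive clients" line of the rndc status output. A minimal sh sketch for pulling out its first number so it can be logged or compared over time; recursive_clients is a made-up helper name, and it assumes the line format shown in the output above. A zero there is only the symptom reported in this message, not proof of a hang by itself:

```shell
#!/bin/sh
# Sketch only: recursive_clients is a hypothetical helper, not part of BIND.
# Reads "rndc status" output on stdin and prints the first number of the
# "recursive clients: A/B/C" line (nothing is printed if the line is absent).
recursive_clients() {
    sed -n 's/^recursive clients: \([0-9][0-9]*\)\/.*/\1/p'
}
```

Typical use would be `rndc status | recursive_clients`, run periodically alongside a test query against 127.0.0.1.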
BIND 9.9.1-P4 is now available
Introduction

BIND 9.9.1-P4 is the latest production release of BIND 9.9.1 (BIND 9.9.2 is also available for download and is the latest production release of BIND 9.9).

This document summarizes changes from BIND 9.9.0 to BIND 9.9.1-P4. Please see the CHANGES file in the source code release for a complete list of all changes.

Download

The latest versions of BIND 9 software can always be found on our web site at http://www.isc.org/downloads/all. There you will find additional information about each release, source code, and pre-compiled versions for Microsoft Windows operating systems.

Support

Product support information is available on http://www.isc.org/services/support for paid support options. Free support is provided by our user community via a mailing list. Information on all public email lists is available at https://lists.isc.org/mailman/listinfo.

Security Fixes

* A deliberately constructed combination of records could cause named to hang while populating the additional section of a response. [RT #31090]
* Prevents a named assert (crash) when queried for a record whose RDATA exceeds 65535 bytes. [RT #30416]
* Prevents a named assert (crash) when validating caused by using "Bad cache" data before it has been initialized. [RT #30025]
* ISC_QUEUE handling for recursive clients was updated to address a race condition that could cause a memory leak. This rarely occurred with UDP clients, but could be a significant problem for a server handling a steady rate of TCP queries. [RT #29539 & #30233]
* A condition has been corrected where improper handling of zero-length RDATA could cause undesirable behavior, including termination of the named process. [RT #29644]

New Features

* None

Feature Changes

* BIND now recognizes the TLSA resource record type, created to support IETF DANE (DNS-based Authentication of Named Entities) [RT #28989]
* A note will be added to the README in future releases to explain that the improved scalability provided by using multiple threads to listen for and process queries (change 3137, RT #22992) does not provide any performance benefit when running BIND on versions of the Linux kernel that do not include the 'lockless UDP transmit path' changes that were incorporated in 2.6.39. (Some Linux distributors may have provided this functionality under their own version numbering systems.)

Bug Fixes

* Fixes the defect introduced by change #3314 that was causing failures when saving stub zones to disk (resulting in excessive CPU usage in some cases). [RT #29952]
* The locking strategy around the handling of iterative queries has been tuned to reduce unnecessary contention in a multi-threaded environment. (Note that this may not provide a measurable improvement over previous versions of BIND, but it corrects the performance impact of change 3309 / RT #27995) [RT #29239]
* Addresses a race condition that can cause named to crash when the masters list for a zone is updated via rndc reload/reconfig [RT #26732]
* named-checkconf now correctly validates dns64 clients acl definitions. [RT #27631]
* Fixes a race condition in zone.c that can cause named to crash during the processing of rndc delzone [RT #29028]
* Prevents a named segfault from resolver.c due to procedure fctx_finddone() not being thread-safe. [RT #27995]
* Improves DNS64 reverse zone performance. [RT #28563]
* Adds wire format lookup method to sdb. [RT #28563]
* Uses hmctx, not mctx when freeing rbtdb->heaps to avoid triggering an assertion when flushing cache data. [RT #28571]
* Prevents intermittent named crashes following an rndc reload [RT #28606]
* Resolves inconsistencies in locating DNSSEC keys where zone names contain characters that require special mappings [RT #28600]
* A new flag -R has been added to queryperf for running tests using non-recursive queries. It also now builds correctly on MacOS version 10.7 (darwin) [RT #28565]
* Named no longer crashes if gssapi is enabled in named.conf but was not compiled into the binary [RT #28338]
* SDB now handles unexpected errors from back-end database drivers gracefully instead of exiting on an assert. [RT #28534]
* Prevents named crashes as a result of dereferencing a NULL pointer in zmgr_start_xfrin_ifquota if the zone was being removed while there were zone transfers still pending [RT #28419]
* Corrects a parser bug that could cause named to crash while reading a malformed zone file. [RT #28467]
* Ensures that when a client recurses its status fields are consistently set so that named doesn't fail on an INSIST in client.c:exit_check. [RT #28346]
* Fixed a problem preventing proper use of 64 bit time values in libbind. [RT #26542]
* isccc/cc.c:table_fromwire could fail to free an allocated object on error, leading to a possi