Re: [Pdns-users] Oracle/goracle - bug or my lack of knowledge?
On Mon, 08 Sep 2014, Aki Tuomi wrote:

> I cannot see you trying these in the pastes you attached. Can you show the full configure line with these?

For --with-oracle-libs=path it clearly is: --with-oracle-libs=$ORACLE_HOME/lib (or, the absolute equivalent of that). I guess that's the default. To be honest I've already deleted the (VirtualBox) Ubuntu and Debian instances. I still have the Oracle Linux (dev. version) one.

But for --with-oracle-includes= anything I tried made it only fail faster, so whatever I did was wrong anyway. This seems to work: --with-oracle-included=/usr/include/oracle/12.1/client64 But again I guess it's the default, and to me doesn't seem needed.

> I would also recommend that you use the *oracle* backend instead of *goracle* if you can.

Yep; I figured from your previous posts. As oracle was failing I just tried my luck on goracle.

> Also, where are your headers?

You mean Boost headers? Or OCI headers? I see them in $ORACLE_HOME/rdbms/public as required. In $ORACLE_HOME/lib I only see:

[root@localhost dbhome_1]# ls lib/ | grep OCI
[root@localhost dbhome_1]# ls lib/ | grep oci
libocijdbc12.so

Or do you mean /usr/src/kernels/3.8.13-16.2.1.el6uek.x86_64/include/linux ...?

I guess you want to see the entire output at the end of this message. Which is:

[root@localhost pdns-3.3.1]# ./configure --without-lua --with-modules=oracle --enable-cryptopp --with-oracle-included=/usr/include/oracle/12.1/client64 --with-oracle-libs=/u01/app/oracle/product/12.1.0/dbhome_1/lib/
configure: WARNING: unrecognized options: --with-oracle-included
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking how to create a ustar tar archive... gnutar
checking whether make supports nested variables... yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking how to print strings... printf
checking for style of include used by make... GNU
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking dependency style of gcc... gcc3
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /usr/bin/ld
checking if the linker (/usr/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /usr/bin/nm -B
checking the name lister (/usr/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking whether the shell understands some XSI constructs... yes
checking whether the shell understands +=... yes
checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /usr/bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /usr/bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for mt... no
checking if : is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... no
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/usr/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping
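The configure paste in this message already carries a hint: configure itself reports `--with-oracle-included` as an unrecognized option, which suggests that flag had no effect. A minimal Python sketch for spotting such warnings in saved configure output follows. The helper name `unrecognized_options` is hypothetical (it is not part of PowerDNS or autoconf), and the pattern only assumes the warning wording shown in the paste.

```python
import re

# Matches the warning text and the option names that follow it.
# Works whether the log is one entry per line or flattened onto one line.
WARNING = re.compile(r"unrecognized options:((?:\s*--[\w-]+,?)+)")

def unrecognized_options(configure_output):
    """Return the option names that configure reported as unrecognized."""
    found = []
    for match in WARNING.finditer(configure_output):
        found.extend(re.findall(r"--[\w-]+", match.group(1)))
    return found

sample = (
    "configure: WARNING: unrecognized options: --with-oracle-included\n"
    "checking for a BSD-compatible install... /usr/bin/install -c\n"
)
print(unrecognized_options(sample))
```

Running this over the full output before reading further would flag a mistyped `--with-...` flag early, instead of leaving it buried above a long list of checks.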
[Pdns-users] Multiple forward-zone-file
Hi,

is it possible to have multiple files in 'forward-zone-files=' (PDNS recursor 3.6)? I want to split the file into smaller parts and have already tried a few different separators (e.g. ; or ,), but no luck...

Regards,
Marco

___
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
http://mailman.powerdns.com/mailman/listinfo/pdns-users
[Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
Hey guys,

I've been having a problem with recursion. For some reason, certain domains seem to throw SERVFAIL errors when dug most of the time, but then NOERROR with a correct response at other random times. For example:

root@yoshi:/# dig toyotasupplier.com
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> toyotasupplier.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 2636
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;toyotasupplier.com.            IN      A
;; Query time: 0 msec
;; SERVER: 208.88.248.25#53(208.88.248.25)
;; WHEN: Wed Sep 3 13:36:33 2014
;; MSG SIZE rcvd: 36

And then, a few hours later:

root@yoshi:/# dig toyotasupplier.com
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> toyotasupplier.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56751
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;toyotasupplier.com.            IN      A
;; ANSWER SECTION:
toyotasupplier.com.     18296   IN      A       12.169.52.71
;; Query time: 1 msec
;; SERVER: 208.88.248.25#53(208.88.248.25)
;; WHEN: Thu Sep 4 10:39:38 2014
;; MSG SIZE rcvd: 52

And then, a few hours later still:

root@yoshi:/# dig toyotasupplier.com
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> toyotasupplier.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5171
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;toyotasupplier.com.            IN      A
;; Query time: 3017 msec
;; SERVER: 208.88.248.25#53(208.88.248.25)
;; WHEN: Fri Sep 5 07:50:25 2014
;; MSG SIZE rcvd: 36

All without making a single change. I have been working on debugging this for two days now and absolutely cannot pinpoint a source for the issue. I've increased the max query lengths, the recursor's network and client TCP timeouts, restarted the service several times on several of our DNS servers, and nothing I do seems to fix it. It of course doesn't help that the bug is a bit of a gremlin and keeps mischievously disappearing at random (and in fact, to my knowledge, never happened until about a week ago, when it started to occur for no apparent reason).

Any idea on what could be causing this? FWIW, when I run dig toyotasupplier.com ns it consistently works fine:

root@yoshi:/# dig toyotasupplier.com ns
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> toyotasupplier.com ns
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39522
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;toyotasupplier.com.            IN      NS
;; ANSWER SECTION:
toyotasupplier.com.     50741   IN      NS      gslb-ns2.toyota-na.com.
toyotasupplier.com.     50741   IN      NS      gslb-ns1.toyota-na.com.
;; Query time: 1 msec
;; SERVER: 208.88.248.25#53(208.88.248.25)
;; WHEN: Fri Sep 5 07:49:29 2014
;; MSG SIZE rcvd: 92

Many thanks in advance,

Todd W. Smith
IP Services Technician
2331 East 600 North
Greenfield, IN 46140
(317) 323-2021
tsm...@ninestarconnect.com
www.ninestarconnect.com
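To compare runs like the ones in this message without eyeballing each paste, the status and query time can be pulled out of saved dig output. This is only a sketch: `parse_dig` is a hypothetical helper, and the patterns assume the `status: ...` and `Query time: ... msec` fields exactly as they appear in the pasted output.

```python
import re

def parse_dig(output):
    """Extract the response status and query time (ms) from saved dig output."""
    status = re.search(r"status: (\w+)", output)
    elapsed = re.search(r"Query time: (\d+) msec", output)
    return {
        "status": status.group(1) if status else None,
        "query_time_ms": int(elapsed.group(1)) if elapsed else None,
    }

# Excerpt of the third run quoted in the message.
sample = """\
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5171
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; Query time: 3017 msec
;; SERVER: 208.88.248.25#53(208.88.248.25)
"""
print(parse_dig(sample))
```

Collected over time, these two fields make the pattern in the message easier to see: the three A-record runs report query times of 0, 1 and 3017 msec.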
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
I'd say it's on Toyota's end:

$ dig toyotasupplier.com +short @gslb-ns1.toyota-na.com
; <<>> DiG 9.7.3 <<>> toyotasupplier.com +short @gslb-ns1.toyota-na.com
;; global options: +cmd
connection timed out; no servers could be reached

Their other DNS server works fine... several attempts to reach the first one, however, fail (haven't gotten a success yet). I'd say it's their problem.

- Brian Menges
Principal Engineer, DevOps @ GoGrid, LLC.
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
Hey Brian,

That would make perfect sense, and I was thinking along similar lines, but if that's the case, why do I get a consistent NOERROR when using Google DNS? Google's cache perhaps?

root@yoshi:/# dig toyotasupplier.com @8.8.8.8
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> toyotasupplier.com @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35779
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;toyotasupplier.com.            IN      A
;; ANSWER SECTION:
toyotasupplier.com.     21594   IN      A       12.169.52.71
;; Query time: 30 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Sep 9 12:34:43 2014
;; MSG SIZE rcvd: 52

root@yoshi:/# dig toyotasupplier.com @208.88.248.27
; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> toyotasupplier.com @208.88.248.27
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 29841
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;toyotasupplier.com.            IN      A
;; Query time: 49 msec
;; SERVER: 208.88.248.27#53(208.88.248.27)
;; WHEN: Tue Sep 9 12:35:02 2014
;; MSG SIZE rcvd: 36

-T
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
On Tue, Sep 9, 2014 at 9:55 AM, Brian Menges bmen...@gogrid.com wrote:

> I’d say it’s on Toyota’s end:

Same here, gslb-ns1.toyota-na.com not responding (Comcast, Seattle, WA)
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
Well, as long as one server works you should get an answer. Try setting a trace-regex on toyota and see what your powerdns reports!

http://doc.powerdns.com/html/rec-control.html - trace-regex

Bert
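The point that one working server should be enough can be illustrated with a toy model in Python. This is a deliberately simplified sketch and not PowerDNS's actual server-selection logic, which is more involved; `resolve` and `fake_ask` are hypothetical names, and `fake_ask` stands in for a real DNS query so the example runs offline.

```python
def resolve(nameservers, ask):
    """Try each nameserver in turn; return (server, answer) from the first that replies."""
    for server in nameservers:
        try:
            return server, ask(server)
        except TimeoutError:
            continue  # this server did not reply in time; try the next one
    return None  # every server timed out, so the client would see a failure

def fake_ask(server):
    # Hypothetical stand-in for a real query: ns1 never answers, ns2 does.
    if server == "gslb-ns1.toyota-na.com.":
        raise TimeoutError(server)
    return "12.169.52.71"

print(resolve(["gslb-ns1.toyota-na.com.", "gslb-ns2.toyota-na.com."], fake_ask))
```

In this model a failure is only returned when every listed server times out, which is why a trace of what the recursor actually tried is the natural next step.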
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
Rather long output here; however, it certainly looks like these results pretty much confirm that the issue is Toyota's, not ours:

Sep 9 12:47:07 yoshi pdns_recursor[31821]: 1 [10690756] question for 'toyotasupplier.com.|A' from 208.88.248.27
Sep 9 12:47:12 yoshi pdns_recursor[31821]: 0 [3961638] question for 'toyotasupplier.com.|A' from 208.88.248.27
Sep 9 12:47:17 yoshi pdns_recursor[31821]: 1 [10691043] question for 'toyotasupplier.com.|A' from 208.88.248.27
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Looking for CNAME cache hit of 'toyotasupplier.com.|CNAME'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: No CNAME cache hit of 'toyotasupplier.com.|CNAME' found
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: No cache hit for 'toyotasupplier.com.|A', trying to find an appropriate NS record
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Checking if we have NS in cache for 'toyotasupplier.com.'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: NS (with ip, or non-glue) in cache for 'toyotasupplier.com.' - 'gslb-ns1.toyota-na.com.'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: within bailiwick: 0, not in cache / did not look at cache
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: NS (with ip, or non-glue) in cache for 'toyotasupplier.com.' - 'gslb-ns2.toyota-na.com.'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: within bailiwick: 0, not in cache / did not look at cache
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: We have NS in cache for 'toyotasupplier.com.' (flawedNSSet=0)
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Cache consultations done, have 2 NS to contact
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Nameservers: gslb-ns1.toyota-na.com.(0.00ms), gslb-ns2.toyota-na.com.(0.00ms)
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Trying to resolve NS 'gslb-ns1.toyota-na.com.' (1/2)
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] gslb-ns1.toyota-na.com.: Looking for CNAME cache hit of 'gslb-ns1.toyota-na.com.|CNAME'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] gslb-ns1.toyota-na.com.: No CNAME cache hit of 'gslb-ns1.toyota-na.com.|CNAME' found
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] gslb-ns1.toyota-na.com.: Found cache hit for A: 63.238.139.235[ttl=80545]
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Resolved 'toyotasupplier.com.' NS gslb-ns1.toyota-na.com. to: 63.238.139.235
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Trying IP 63.238.139.235:53, asking 'toyotasupplier.com.|A'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: timeout resolving
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Trying to resolve NS 'gslb-ns2.toyota-na.com.' (2/2)
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] gslb-ns2.toyota-na.com.: Looking for CNAME cache hit of 'gslb-ns2.toyota-na.com.|CNAME'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] gslb-ns2.toyota-na.com.: No CNAME cache hit of 'gslb-ns2.toyota-na.com.|CNAME' found
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] gslb-ns2.toyota-na.com.: Found cache hit for A: 12.169.52.62[ttl=80540]
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Resolved 'toyotasupplier.com.' NS gslb-ns2.toyota-na.com. to: 12.169.52.62
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Trying IP 12.169.52.62:53, asking 'toyotasupplier.com.|A'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: timeout resolving
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Failed to resolve via any of the 2 offered NS at level 'toyotasupplier.com.'
Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: failed (res=-1)
Sep 9 12:47:22 yoshi pdns_recursor[31821]: [3961638] toyotasupplier.com.: Looking for CNAME cache hit of 'toyotasupplier.com.|CNAME'
Sep 9 12:47:22 yoshi pdns_recursor[31821]: [3961638] toyotasupplier.com.: No CNAME cache hit of 'toyotasupplier.com.|CNAME' found
Sep 9 12:47:22 yoshi pdns_recursor[31821]: [3961638] toyotasupplier.com.: No cache hit for 'toyotasupplier.com.|A', trying to find an appropriate NS record
Sep 9 12:47:22 yoshi pdns_recursor[31821]: [3961638] toyotasupplier.com.: Checking if we have NS in cache for 'toyotasupplier.com.'
Sep 9 12:47:22 yoshi pdns_recursor[31821]: [3961638] toyotasupplier.com.: NS (with ip, or non-glue) in cache for 'toyotasupplier.com.' - 'gslb-ns1.toyota-na.com.'
Sep 9 12:47:22 yoshi pdns_recursor[31821]: [3961638] toyotasupplier.com.: within
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
On Tue, Sep 09, 2014 at 05:16:24PM +, Todd Smith wrote:

> Rather long output here; however, it certainly looks like these results pretty much confirm that the issue is Toyota's, not ours:
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Resolved 'toyotasupplier.com.' NS gslb-ns1.toyota-na.com. to: 63.238.139.235
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Trying IP 63.238.139.235:53, asking 'toyotasupplier.com.|A'
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: timeout resolving
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Trying IP 12.169.52.62:53, asking 'toyotasupplier.com.|A'
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: timeout resolving
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: Failed to resolve via any of the 2 offered NS at level 'toyotasupplier.com.'
> Sep 9 12:47:17 yoshi pdns_recursor[31821]: [10690756] toyotasupplier.com.: failed (res=-1)

You could try a traceroute to these two addresses to debug, but this indeed does not look like a powerdns issue but more like a networking issue! Note that your timeout does appear to be 1 second; you could try raising this with 'network-timeout=2000' (2 seconds) and see if this helps.

Good luck!

Bert
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
Actually that begs one more question--as of right now I actually have network-timeout set to 5000 in recursor.conf, yet obviously it's still timing out considerably sooner than that; is there, say, some other setting (that is, of course, within PowerDNS) that might be conflicting with this, causing it to time out sooner? If not (as I suspect), I'll investigate our network settings to see if anything else might be clipping these requests off short.

Many many thanks again

-T
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
On Tue, Sep 09, 2014 at 05:57:08PM +, Todd Smith wrote:

> Actually that begs one more question--as of right now I actually have network-timeout set to 5000 in recursor.conf, yet obviously it's still timing out considerably sooner than that; is there, say, some other setting (that is, of course, within PowerDNS) that might be conflicting with this, causing it to time out sooner?

Just checked: 3.6.0 honors network-timeout correctly under normal conditions. I'm now looking at whether there are situations in which we might be short-circuiting it.

Bert
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
On Tue, Sep 09, 2014 at 08:20:48PM +0200, bert hubert wrote:

> Just checked, 3.6.0 honors network-timeout correctly under normal conditions, seeing if there are possibilities when we might be short circuiting it.

Ok, the apparently short timeout has to do with the nature of trace-regex, which outputs the whole trace at the end of the resolve process, with only one timestamp. I clarified the output a bit, so that it now reads:

Sep 09 20:28:50 [13281] toyotasupplier.com.: Resolved 'toyotasupplier.com.' NS gslb-ns1.toyota-na.com. to: 63.238.139.235
Sep 09 20:28:50 [13281] toyotasupplier.com.: Trying IP 63.238.139.235:53, asking 'toyotasupplier.com.|MX'
Sep 09 20:28:50 [13281] toyotasupplier.com.: timeout resolving after 5000.47msec
Sep 09 20:28:50 [13281] toyotasupplier.com.: Trying to resolve NS 'gslb-ns2.toyota-na.com.' (2/2)

https://github.com/PowerDNS/pdns/commit/863ca18dd298ad0f2ee377aaf539450bc81e0b0a

I also get intermittent failures of toyotasupplier.com here by the way, so it isn't just you!

Bert
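Because every line of a trace carries the same timestamp, the only reliable duration is the one printed in the "timeout resolving after ...msec" line. A small Python sketch for listing which addresses timed out, and after how long, follows. The name `timeouts` is hypothetical, the patterns are based only on the log lines quoted in this thread, and the sketch assumes the lines belong to a single query (one id in brackets).

```python
import re

TRYING = re.compile(r"Trying IP ([\d.]+):(\d+), asking '([^']+)'")
# The duration is optional: older output prints just "timeout resolving".
TIMEOUT = re.compile(r"timeout resolving(?: after ([\d.]+)msec)?")

def timeouts(trace_lines):
    """Pair each 'timeout resolving' line with the IP tried just before it."""
    results = []
    current_ip = None
    for line in trace_lines:
        trying = TRYING.search(line)
        if trying:
            current_ip = trying.group(1)
            continue
        timed_out = TIMEOUT.search(line)
        if timed_out and current_ip is not None:
            waited = timed_out.group(1)
            results.append((current_ip, float(waited) if waited else None))
            current_ip = None
    return results

trace = [
    "Sep 09 20:28:50 [13281] toyotasupplier.com.: Trying IP 63.238.139.235:53, asking 'toyotasupplier.com.|MX'",
    "Sep 09 20:28:50 [13281] toyotasupplier.com.: timeout resolving after 5000.47msec",
    "Sep 09 20:28:50 [13281] toyotasupplier.com.: Trying to resolve NS 'gslb-ns2.toyota-na.com.' (2/2)",
]
print(timeouts(trace))
```

For traces in the older format, where no duration is printed, the second element of each pair is `None`, which makes it explicit that the wait time cannot be recovered from the timestamps.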
Re: [Pdns-users] Recursion issue--SERVFAIL then NOERROR totally at random
I'd say Google is talking to the one that answers, and caches that.

63.238.139.235 (gslb-ns1.toyota-na.com) definitely has issues.

- Brian Menges
Principal Engineer, DevOps @ GoGrid, LLC.