Hi,

We've noticed some weirdness regarding CNAMEs on pdns-2.9.22.
Here's the thing: we use the gpgsql backend. When a domain has a CNAME record pointing to an external domain (i.e. one whose DNS is not hosted by our nameservers), and the target is an A record or a CNAME record (e.g. ghs.l.google.com or ghs.google.com), we get different results from one dig lookup to the next. (We started looking into this issue after some customers complained that their clients sometimes could not visit their site, which used such a CNAME record.)

This is what (I think) we should expect as a result for a dig query without recursion:

  * status: NOERROR & flags: qr aa
  * result: cname-test1.as12573.net. IN CNAME ghs.google.com.

And this for a query with recursion (which is denied on our auth servers):

  * status: SERVFAIL & flags: qr rd
  * result: cname-test1.as12573.net. IN CNAME ghs.google.com.

When we execute, for example:

    for i in `gseq 1 50`; do echo "Executing dig#$i"; sleep 0.5; dig -4 +norec -t A @ns2.widexs.net cname-test1.as12573.net.; done

we see the following happening:

    Executing dig#16

    ; <<>> DiG 9.6.1-P1 <<>> -4 +norec -t A @ns2.widexs.net cname-test1.as12573.net.
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 22048
    ;; flags: qr; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;cname-test1.as12573.net.       IN      A

    ;; ANSWER SECTION:
    cname-test1.as12573.net. 3600   IN      CNAME   ghs.google.com.

    ;; Query time: 1 msec
    ;; SERVER: 212.204.207.192#53(212.204.207.192)
    ;; WHEN: Fri Jul 9 00:41:03 2010
    ;; MSG SIZE rcvd: 69

    Executing dig#17

    ; <<>> DiG 9.6.1-P1 <<>> -4 +norec -t A @ns2.widexs.net cname-test1.as12573.net.
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61188
    ;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;cname-test1.as12573.net.       IN      A

    ;; ANSWER SECTION:
    cname-test1.as12573.net. 3600   IN      CNAME   ghs.google.com.

    ;; Query time: 2 msec
    ;; SERVER: 212.204.207.192#53(212.204.207.192)
    ;; WHEN: Fri Jul 9 00:41:03 2010
    ;; MSG SIZE rcvd: 69

    Executing dig#18

    ; <<>> DiG 9.6.1-P1 <<>> -4 +norec -t A @ns2.widexs.net cname-test1.as12573.net.
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58857
    ;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;cname-test1.as12573.net.       IN      A

    ;; ANSWER SECTION:
    cname-test1.as12573.net. 3600   IN      CNAME   ghs.google.com.

    ;; Query time: 1 msec
    ;; SERVER: 212.204.207.192#53(212.204.207.192)
    ;; WHEN: Fri Jul 9 00:41:04 2010
    ;; MSG SIZE rcvd: 69

And vice versa, while executing:

    for i in `gseq 1 50`; do echo "Executing dig#$i"; sleep 0.5; dig -4 +rec -t A @ns2.widexs.net cname-test1.as12573.net.; done

we see:

    Executing dig#16

    ; <<>> DiG 9.6.1-P1 <<>> -4 +rec -t A @ns2.widexs.net cname-test1.as12573.net.
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17904
    ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;cname-test1.as12573.net.       IN      A

    ;; ANSWER SECTION:
    cname-test1.as12573.net. 3600   IN      CNAME   ghs.google.com.

    ;; Query time: 1 msec
    ;; SERVER: 212.204.207.192#53(212.204.207.192)
    ;; WHEN: Fri Jul 9 00:46:43 2010
    ;; MSG SIZE rcvd: 69

    Executing dig#17

    ; <<>> DiG 9.6.1-P1 <<>> -4 +rec -t A @ns2.widexs.net cname-test1.as12573.net.
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 52573
    ;; flags: qr rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;cname-test1.as12573.net.       IN      A

    ;; ANSWER SECTION:
    cname-test1.as12573.net. 3600   IN      CNAME   ghs.google.com.

    ;; Query time: 2 msec
    ;; SERVER: 212.204.207.192#53(212.204.207.192)
    ;; WHEN: Fri Jul 9 00:46:43 2010
    ;; MSG SIZE rcvd: 69

    Executing dig#18

    ; <<>> DiG 9.6.1-P1 <<>> -4 +rec -t A @ns2.widexs.net cname-test1.as12573.net.
    ; (1 server found)
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 7623
    ;; flags: qr rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;; WARNING: recursion requested but not available

    ;; QUESTION SECTION:
    ;cname-test1.as12573.net.       IN      A

    ;; ANSWER SECTION:
    cname-test1.as12573.net. 3600   IN      CNAME   ghs.google.com.

    ;; Query time: 1 msec
    ;; SERVER: 212.204.207.192#53(212.204.207.192)
    ;; WHEN: Fri Jul 9 00:46:44 2010
    ;; MSG SIZE rcvd: 69

So as you can see, the 'invalid' SERVFAIL status for a +norec query changes after a while to the correct NOERROR status, and the 'invalid' NOERROR status for a +rec query changes after a while to the correct SERVFAIL status...

It seems related to the packetcache, because if we change the cache-ttl value from 20 to 0 (no cache), the strange behaviour disappears and every query gets the same (correct) result...

We use pdns-2.9.22-3.x86_64.rpm from EPEL on CentOS 5.5, and a source-compiled 2.9.22 on a FreeBSD 6 server. Since we noticed this problem, I've also applied commit 1344 for ticket 223 (send-root-referral=no was ignored), and that fixed that particular issue. I applied commit 1407 as well, but unfortunately that doesn't seem to help with this issue...

So I currently have cache-ttl=0 set, but I hope it won't slow things down too much.

Does anyone (Bert?) have a nice explanation for this, for example whether this is perhaps normal behaviour after all, and whether it can have a bad impact on (perhaps broken) resolvers?

Thanks!

Regards,
Wouter
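
P.S. For anyone who wants to reproduce this without reading through 50 dig outputs by eye, here is a rough sketch of how the loop output could be tallied per status/flags combination. The function name summarize_dig is made up for illustration, and it assumes the dig header layout shown above (a "->>HEADER<<-" line followed by a ";; flags:" line); other dig versions may print it differently.

```shell
# summarize_dig: hypothetical helper, not part of dig or pdns.
# Reads dig output on stdin and prints one line per distinct
# "status [flags]" combination, prefixed by how often it occurred.
summarize_dig() {
    awk '
        # Header line: remember the status that follows "status:".
        /->>HEADER<<-/ {
            status = "?"
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
        }
        # Flags line: keep only the flag names before the first ";".
        /^;; flags:/ {
            flags = $0
            sub(/^;; flags: */, "", flags)
            sub(/;.*/, "", flags)
            count[status " [" flags "]"]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k2
}

# Example use with the +norec loop from above (needs network access):
#   for i in `gseq 1 50`; do sleep 0.5; \
#     dig -4 +norec -t A @ns2.widexs.net cname-test1.as12573.net.; done | summarize_dig
```

With a consistent server you would expect a single line of output per loop; more than one line means the status or flags changed between otherwise identical queries.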
_______________________________________________ Pdns-users mailing list [email protected] http://mailman.powerdns.com/mailman/listinfo/pdns-users
