control: retitle -1 libc6: support for non-compliant nameserver should be improved
control: severity -1 wishlist

On 2016-08-12 12:15, Vincent Lefevre wrote:
> On 2016-08-12 09:26:10 +0200, Aurelien Jarno wrote:
> > The libc does a first connection to the configured name server
> > (192.168.0.1) using UDP. Note the size of the packet, very close to
> > the 512 bytes limit without EDNS0 support. This very likely mean the
> > answer is marked as truncated (look at the number of entries in the
> > host answer).
>
> According to tcpdump output below, there is no truncation: the number
> of A's and AAAA's (10 for each) match what "host keys.gnupg.net"
> gives. BTW, even if there were a truncation, there shouldn't be a
> failure: using of the returned IP addresses would be sufficient to
> connect.

That's a wrong assumption. The purpose of the libc getaddrinfo
interface is not to connect to an IP address, but to return *all*
addresses corresponding to a query. The returned addresses are not
necessarily used for a connection later. Not returning all of them
might lead to data loss or a security issue. One example among others
is the forward-confirmed reverse DNS method used by some mail servers:
not returning all IPs might lead to a rejected or discarded mail,
depending on the policy.

The point is that the local resolver is supposed to be working
correctly. If it doesn't, one can easily set up a local recursive name
server like unbound.

> 11:55:59.097743 IP 192.168.0.6.41008 > 192.168.0.1.domain: 60367+ A? keys.gnupg.net. (32)
> 11:55:59.097796 IP 192.168.0.6.41008 > 192.168.0.1.domain: 31606+ AAAA? keys.gnupg.net. (32)
> 11:55:59.098339 IP 192.168.0.6.38010 > 192.168.0.1.domain: 4217+ PTR? 1.0.168.192.in-addr.arpa. (42)
> 11:55:59.143100 IP 192.168.0.1.domain > 192.168.0.6.38010: 4217 NXDomain* 0/1/0 (94)
> 11:55:59.143325 IP 192.168.0.6.43592 > 192.168.0.1.domain: 23396+ PTR? 6.0.168.192.in-addr.arpa. (42)
> 11:55:59.161082 IP 192.168.0.1.domain > 192.168.0.6.41008: 60367 11/9/5 CNAME pool.sks-keyservers.net., A 198.128.3.63, A 93.94.119.246, A 78.46.223.54, A 131.175.15.4, A 151.252.40.184, A 5.9.50.141, A 209.135.211.141, A 5.135.158.148, A 68.187.0.77, A 193.17.17.6 (502)

This tcpdump trace doesn't show the answer header, so we don't know
whether the truncation flag is set. That said, the 11/9/5 says that the
answer contains 11 answer records, 9 name server records and 5
additional records. This clearly doesn't fit. A normal DNS server would
just return 11 answers, so 11/0/0.

That said, I just realized that the strace entry in your previous email
contains the beginning of the answer:

> 30419 recvfrom(4, "'J\203\200\0\1\0\v\0\10\0\0\4keys\5gnupg\3net\0\0\34\0\1"..., 2048, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.0.1")}, [16]) = 500

Converted into hexadecimal, this is:

27 4a 83 80 00 01 00 0b 00 08 00 00 04 6b 65 79
73 05 67 6e 75 70 67 03 6e 65 74 00 00 1c 00 01

274a is the identification. The flags are 8380, which corresponds to
QR, TC, RD, RA. Your name server clearly says that the answer is
truncated. On a working nameserver, the flags are 8180 for this query,
i.e. the same without the truncation flag.

> > It therefore looks to me like a bug with your network setup, not a
> > libc one.
>
> Well, though I didn't want that, this is quite a standard network
> setup: my machine just uses DHCP with some standard ADSL modem
> router. And given that many users have similar issues and there
> isn't any problem with Android, I suppose that there's some bug
> on the libc side (or libc can be improved).

Even if it is quite a standard setup, you have to admit it doesn't
behave according to the RFC. You should complain to the manufacturer
and try to get a firmware update. Trying to work around things on the
libc side just gives even less value to the RFCs, and encourages the
sale of broken hardware.
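For reference, here is a minimal sketch of how the first 12 bytes of
that answer decode. It is written in Python purely for illustration and
is not how the libc parses the header; the helper name
decode_dns_header is made up, and the flag masks are the ones from the
header layout in RFC 1035.

```python
import struct

# Header flag masks, following the header layout in RFC 1035.
FLAG_MASKS = {
    "QR": 0x8000,  # this message is a response
    "AA": 0x0400,  # authoritative answer
    "TC": 0x0200,  # answer was truncated
    "RD": 0x0100,  # recursion desired
    "RA": 0x0080,  # recursion available
}


def decode_dns_header(raw: bytes) -> dict:
    """Decode the fixed 12-byte DNS header (illustrative helper only)."""
    if len(raw) < 12:
        raise ValueError("a DNS header is 12 bytes long")
    # Six big-endian 16-bit fields: id, flags and the four record counts.
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", raw[:12])
    return {
        "id": ident,
        "flags": [name for name, mask in FLAG_MASKS.items() if flags & mask],
        "rcode": flags & 0x000F,
        "counts": (qd, an, ns, ar),  # question/answer/authority/additional
    }


# First 12 bytes of the answer seen in the strace output.
header = decode_dns_header(bytes.fromhex("274a8380" "0001000b" "00080000"))
print(header)
```

Feeding it the same bytes with 8180 in place of 8380 gives the same
flag list without TC, which is what a working nameserver returns here.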

> FYI, I also often get 5-second timeouts in name resolution whatever
> the host (you can see it above): I get the answer for A or AAAA, but
> sometimes, the other answer is lost. I have a DHCP hook that tests
> whether I'm using this router:
>
> [...]
> ping -n -c 1 -I "$interface" "$new_routers" > /dev/null
> if grep -i -q $mac /proc/net/arp; then
> logger "Google Public DNS with TCP to avoid recurrent timeout"
> [...]

This shows how broken your name server is. It probably has a problem
with AAAA requests. Note that the RFC explicitly allows a server not to
support some request types (including AAAA ones), but in that case the
router must answer that it doesn't support the request rather than
simply dropping it. You might want to try to work around this by using
"options single-request" or "options single-request-reopen" in
/etc/resolv.conf.

In short, this clearly shows that the problem comes from the name
server and not from the GNU libc:
- the nameserver sets the truncation bit
- the nameserver doesn't answer on the TCP port
- the nameserver sometimes drops AAAA queries

With such a broken nameserver, I would advise you to use a local
nameserver like unbound instead. The GNU libc might be improved to
better cope with such broken nameservers. That said, it is at most a
wishlist severity and probably a wontfix, as developing the workaround
requires having the hardware.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net