Re: unbound/dns issue (malformed packets?)
Hi Joe,

The domain whatsapp.com doesn't guarantee integrity to you (they have
DNSSEC turned off, at least last I checked). It's possible that someone
got in the middle and inserted a bogus record. That said, I'm not aware
of nlnetlabs having changed their internal database, so this is likely
not a corruption issue but something that stems from the wire.

Hopefully my 2 cents are helpful.

-peter

On Sun, Sep 15, 2019 at 06:23:28PM -0700, Joe Barnett wrote:
> I've been seeing some issues which I believe to be related to dns/resolving.
> The short of it is that the results of
>
> # dig web.whatsapp.com
>
> start out as:
>
> ; <<>> DiG 9.4.2-P2 <<>> web.whatsapp.com
> ;; global options: printcmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57665
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;web.whatsapp.com.             IN         A
>
> ;; ANSWER SECTION:
> web.whatsapp.com.         3595   IN         CNAME  mmx-ds.cdn.whatsapp.net.
> mmx-ds.cdn.whatsapp.net.  55     IN         A      31.13.70.49
>
> ;; Query time: 6 msec
> ;; SERVER: 192.168.254.254#53(192.168.254.254)
> ;; WHEN: Sun Sep 15 14:46:24 2019
> ;; MSG SIZE rcvd: 87
>
> which seems reasonable (and functional), but then soon become:
>
> ;; Warning: Message parser reports malformed message packet.
>
> ; <<>> DiG 9.4.2-P2 <<>> web.whatsapp.com
> ;; global options: printcmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40939
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;web.whatsapp.com.             IN         A
>
> ;; ANSWER SECTION:
> web.whatsapp.com.         3528   IN         CNAME  mmx-ds.cdn.whatsapp.net.
> mmx-ds.cdn.whatsapp.net.  30772  RESERVED0  A      \# 4 1F0D4631
>
> ;; Query time: 2 msec
> ;; SERVER: 192.168.254.254#53(192.168.254.254)
> ;; WHEN: Sun Sep 15 14:47:31 2019
> ;; MSG SIZE rcvd: 87
>
> At which point I am no longer able to access web.whatsapp.com.
> Given that whatsapp is a facebook property, I tried the above against
> facebook.com, www.facebook.com, instagram.com, and www.instagram.com as
> well.  With the exception of instagram.com, the other three (facebook,
> www.facebook, www.instagram) return a hex (?) formatted version of the IP
> address, similar to what is seen in the latter of the above examples.  My
> thinking is (or was) that there are some issues relating to fb's DNS.
> From outside of my network, however, other resolvers seem to be able to
> continually resolve the above names correctly.  I don't know what those
> resolvers are, but specifically I am referring to whatever Linode and
> DigitalOcean use in the nameservers they provide to their basic Linux vms
> (I am using the default network config in my vms at Linode and
> DigitalOcean).  I have a suspicion that Linode uses unbound, but I do not
> know how to verify that.  Oh, as far as I can tell, those facebook-family
> names *seem* to be the only names for which I see this behavior -- all
> other names that I have tried to run through dig (and nslookup) seem to
> return reasonable and seemingly correct results.
>
> A bit about my (home) network.  I have Cox cable internet service, an
> Arris SBG7580-AC, and an OpenBSD 6.5 machine that sits between the modem
> and the rest of the network.  We do use the modem in router mode (but
> without using the built-in WiFi) as my wife's work setup consists of a
> pre-configured black-box of a Juniper device.  Not wanting that device in
> the rest of our network, I set the modem to "RoutedWithNAT" and the two
> network devices plug into the modem, but provide two separate networks.
> For remote ingress into the rest of the network, I set the modem's DMZ to
> point to the OpenBSD box.  My pf.conf does the usual small network stuff
> including NAT, a bit of redirection, etc.  It has changed very little in
> the past several years.
> My unbound.conf is also nearly unchanged since I first set it up when
> OpenBSD dropped bind and replaced it with unbound.  My OpenBSD machine
> provides name resolving for the rest of the network.  My unbound.conf
> follows:
>
> server:
>     interface: 0.0.0.0
>     interface: ::1
>     do-ip6: no
>
>     access-control: 0.0.0.0/0 refuse
>     access-control: 127.0.0.0/8 allow
>     access-control: 192.168.0.0/16 allow
>     access-control: 10.0.0.0/24 allow
>     access-control: 172.16.0.0/24 allow
>     access-control: ::0/0 refuse
>     access-control: ::1 allow
>
>     hide-identity: yes
>     hide-version: yes
>
>     # ftp://FTP.INTERNIC.NET/domain/named.cache
>     root-hints: "/var/unbound/etc/named.cache"
>
>     # uncomment to enable DNSSEC
>     auto-trust-anchor-file: "/var/unbound/db/root.key"
>
>     ### various local-zone, local-data, and local-data-ptr ###
>
> remote-control:
>     control-enable: yes
>     control-use-cert: yes
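
A side note on the `\# 4 1F0D4631` in the second dig answer quoted above: this looks like the generic RFC 3597 presentation format (`\# <length> <hex>`), which dig presumably falls back to because the record's class came back as 0 (RESERVED0) rather than IN. The four bytes themselves can still be read as an ordinary IPv4 address. Below is a minimal Python sketch of that decoding; the function name `decode_generic_a_rdata` is made up for illustration and is not part of dig or unbound.

```python
import ipaddress


def decode_generic_a_rdata(text: str) -> str:
    """Decode RFC 3597 style generic rdata ('\\# <len> <hex>') as an IPv4 address.

    Hypothetical helper for illustration only.
    """
    parts = text.split()
    if len(parts) < 3 or parts[0] != "\\#":
        raise ValueError("not in generic '\\# <length> <hex>' form")
    length = int(parts[1])
    # RFC 3597 allows the hex data to be split into several words.
    raw = bytes.fromhex("".join(parts[2:]))
    if len(raw) != length:
        raise ValueError("declared length does not match hex data")
    if length != 4:
        raise ValueError("an A record carries exactly 4 bytes")
    return str(ipaddress.IPv4Address(raw))


if __name__ == "__main__":
    # Value copied from the second dig answer in the quoted message.
    print(decode_generic_a_rdata(r"\# 4 1F0D4631"))
```

This prints 31.13.70.49, the same address as in the first (good) answer. So the address bytes appear intact; what differs between the two answers is the class and TTL shown for that record.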
unbound/dns issue (malformed packets?)
I've been seeing some issues which I believe to be related to dns/resolving.
The short of it is that the results of

# dig web.whatsapp.com

start out as:

; <<>> DiG 9.4.2-P2 <<>> web.whatsapp.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57665
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;web.whatsapp.com.             IN         A

;; ANSWER SECTION:
web.whatsapp.com.         3595   IN         CNAME  mmx-ds.cdn.whatsapp.net.
mmx-ds.cdn.whatsapp.net.  55     IN         A      31.13.70.49

;; Query time: 6 msec
;; SERVER: 192.168.254.254#53(192.168.254.254)
;; WHEN: Sun Sep 15 14:46:24 2019
;; MSG SIZE rcvd: 87

which seems reasonable (and functional), but then soon become:

;; Warning: Message parser reports malformed message packet.

; <<>> DiG 9.4.2-P2 <<>> web.whatsapp.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40939
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;web.whatsapp.com.             IN         A

;; ANSWER SECTION:
web.whatsapp.com.         3528   IN         CNAME  mmx-ds.cdn.whatsapp.net.
mmx-ds.cdn.whatsapp.net.  30772  RESERVED0  A      \# 4 1F0D4631

;; Query time: 2 msec
;; SERVER: 192.168.254.254#53(192.168.254.254)
;; WHEN: Sun Sep 15 14:47:31 2019
;; MSG SIZE rcvd: 87

At which point I am no longer able to access web.whatsapp.com.  Given that
whatsapp is a facebook property, I tried the above against facebook.com,
www.facebook.com, instagram.com, and www.instagram.com as well.  With the
exception of instagram.com, the other three (facebook, www.facebook,
www.instagram) return a hex (?) formatted version of the IP address, similar
to what is seen in the latter of the above examples.  My thinking is (or was)
that there are some issues relating to fb's DNS.  From outside of my
network, however, other resolvers seem to be able to continually resolve the
above names correctly.
I don't know what those resolvers are, but specifically I am referring to
whatever Linode and DigitalOcean use in the nameservers they provide to
their basic Linux vms (I am using the default network config in my vms at
Linode and DigitalOcean).  I have a suspicion that Linode uses unbound, but
I do not know how to verify that.  Oh, as far as I can tell, those
facebook-family names *seem* to be the only names for which I see this
behavior -- all other names that I have tried to run through dig (and
nslookup) seem to return reasonable and seemingly correct results.

A bit about my (home) network.  I have Cox cable internet service, an Arris
SBG7580-AC, and an OpenBSD 6.5 machine that sits between the modem and the
rest of the network.  We do use the modem in router mode (but without using
the built-in WiFi) as my wife's work setup consists of a pre-configured
black-box of a Juniper device.  Not wanting that device in the rest of our
network, I set the modem to "RoutedWithNAT" and the two network devices plug
into the modem, but provide two separate networks.  For remote ingress into
the rest of the network, I set the modem's DMZ to point to the OpenBSD box.
My pf.conf does the usual small network stuff including NAT, a bit of
redirection, etc.  It has changed very little in the past several years.
My unbound.conf is also nearly unchanged since I first set it up when
OpenBSD dropped bind and replaced it with unbound.  My OpenBSD machine
provides name resolving for the rest of the network.
My unbound.conf follows:

server:
    interface: 0.0.0.0
    interface: ::1
    do-ip6: no

    access-control: 0.0.0.0/0 refuse
    access-control: 127.0.0.0/8 allow
    access-control: 192.168.0.0/16 allow
    access-control: 10.0.0.0/24 allow
    access-control: 172.16.0.0/24 allow
    access-control: ::0/0 refuse
    access-control: ::1 allow

    hide-identity: yes
    hide-version: yes

    # ftp://FTP.INTERNIC.NET/domain/named.cache
    root-hints: "/var/unbound/etc/named.cache"

    # uncomment to enable DNSSEC
    auto-trust-anchor-file: "/var/unbound/db/root.key"

    ### various local-zone, local-data, and local-data-ptr ###

remote-control:
    control-enable: yes
    control-use-cert: yes
    control-interface: /var/run/unbound.sock

do-ip6, root-hints, and auto-trust-anchor-file are somewhat recent additions
to my unbound.conf, but I experience the same behavior with unbound.conf as
above, and also when I comment out those three additions (bringing it back
to a configuration that has worked for several years).

My OpenBSD machine is an APU2 which I have been using without issue for over
a year.  My backup machine is an ALIX2D3 (I think it is called).  Other than
the APU running amd64 and the ALIX running i386, the machines are
configured exactly the same.  The APU2 has been consistently maintained, and this