Re: [dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)
On Fri, Jul 04, 2014 at 03:04:10PM -0700, Paul Vixie wrote:

Roland Dobbins wrote: I know that some DNS operators disable logging of queries/responses due to the overhead of doing so - are most folks on this list with large-scale DNS recursive and/or authoritative DNS infrastructure disabling logging, enabling it, and/or logging queries/responses out-of-band via packet-capture taps, databases, etc.?

we've been using PCAP to do passive dns collection since 2008 or so, and we've determined that it is unsuitable. the overhead of reassembling fragmented IP datagrams is fairly low, but the overhead and complexity of reassembling TCP streams is extreme -- no post-processor i know is willing to pick through a TCP stream inside a PCAP file processing each transaction.

Paul, I've written many many TCP/IP reassemblers and in fact the overhead is trivial. Your kernel does it all the time for example. The trick is to have a limited window in which you do the reassembly, and not scan over the entire file. Neither does a kernel.

[...] telemetry streams), and has been implemented on a development branch of unbound. we hope to get it into BIND and NSD in the next few months. we received an indication of no interest from powerdns, but i'm hoping bert will relent later on.

Having said all that, it doesn't mean we aren't big fans of logging. But people I know are also big fans of logging being separate from their production servers, and this implies packet reassembly. This is why we have ample tooling in powerdns-tools to analyze packets. Packets also have the wonderful advantage that they represent what actually happened and not what the nameserver THOUGHT happened. We've for example been able to debug many issues caused by malformed packets. Such malformed packets would probably not retain their unique malformation when serialized to dnstap.
As another example, we've in the past had cases where our own logging showed we were serving answers with low latency, yet packet traces showed significant time between query and response packets. The ultimate issue turned out to be queueing before our listening socket. Once we *got* the packet we answered it quickly enough. But we did not (and could not easily) account for when the packet hit the server. Our tool 'dnsscope' shows such statistics wonderfully.

dnstap is completely open source, with a BSD-style license (Apache 2.0). it is sponsored by farsight because we need a uniform DNS telemetry format for our business purposes. we are giving it away in order to make the world better, primarily for our own products and customers, but by necessary extension, better for everybody including our competitors.

It so happens that we now have the infrastructure to plug in arbitrary modules at packet entry and exit, we could perhaps do a dnstap implementation there. Will keep you posted.

Bert

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
dns-jobs mailing list
https://lists.dns-oarc.net/mailman/listinfo/dns-jobs
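To illustrate the point about measuring latency from packet traces instead of from the server's own logs, here is a minimal Python sketch. It is not how 'dnsscope' works internally (the thread does not say). The `Pkt` record and `wire_latencies` function are hypothetical names, and the packets are assumed to be already decoded UDP DNS messages with capture timestamps.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Pkt(NamedTuple):
    """One decoded DNS packet as seen on the wire (hypothetical record shape)."""
    ts: float          # capture timestamp, in seconds
    client: str        # client IP address
    port: int          # client UDP port
    msg_id: int        # DNS message ID
    is_response: bool  # True for responses, False for queries


def wire_latencies(packets: Iterable[Pkt]) -> List[float]:
    """Pair each response with the pending query sharing (client, port, id).

    The latency is measured between capture timestamps, so it includes any
    time the query spent queued before the nameserver read it from its socket.
    """
    pending: Dict[Tuple[str, int, int], float] = {}
    latencies: List[float] = []
    for p in packets:
        key = (p.client, p.port, p.msg_id)
        if not p.is_response:
            # Keep the first sighting so a retransmit does not reset the clock.
            pending.setdefault(key, p.ts)
        elif key in pending:
            latencies.append(p.ts - pending.pop(key))
        # Responses with no pending query are ignored.
    return latencies
```

A server-side log would only see the interval between reading the query and sending the answer. The difference between the two figures is roughly the queueing delay described above.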
Re: [dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)
On Jul 7, 2014, at 3:52 PM, bert hubert bert.hub...@netherlabs.nl wrote:

It so happens that we now have the infrastructure to plug in arbitrary modules at packet entry and exit, we could perhaps do a dnstap implementation there.

An IPFIX implementation would also be welcome - both a structured one for request/response data and perhaps PSAMP-over-IPFIX forwarding of the packets themselves, with a way to choose which packets are forwarded, and an optional sampler.

-- Roland Dobbins rdobb...@arbor.net // http://www.arbornetworks.com
Equo ne credite, Teucri. -- Laocoön
Re: [dns-operations] Prevalence of query/response logging?
On Fri, 4 Jul 2014 18:00:48 +0700 Roland Dobbins rdobb...@arbor.net wrote:

I know that some DNS operators disable logging of queries/responses due to the overhead of doing so - are most folks on this list with large-scale DNS recursive and/or authoritative DNS infrastructure disabling logging, enabling it, and/or logging queries/responses out-of-band via packet-capture taps, databases, etc.?

I've done all of the above. I like to think I was one of the earlier adopters of query logging at two reasonably large .edu institutions, where it is still enabled as far as I know. This was for both authoritative and recursive, but recursive query logs were generally more interesting and useful to me at the time.

I know a handful of folks who avoided query logging, and continue to, based on the assumption that it is too resource intensive. That may be true for some, but it is not universally true, and it is less true than I think many people realize.

I found syslog-ng to be a much better alternative daemon on both the logging client and the collector, for a variety of reasons. On the client, it required less CPU than the stock syslog daemon at the time (Linux and Solaris systems).

pcap-based solutions have been helpful for passive dns style projects, which tend not to be strictly for network operations, but more research and insight oriented tasks.

John
Re: [dns-operations] What's the story on gmail.fr?
On Jul 6 2014, Michele Neylon - Blacknight wrote:

Gmail.ie redirects to Gmail correctly .. Though I've never seen them advertise Gmail using anything other than the .com

gmail.uk & gmail.co.uk are delegated to ns{1,2,3,4}.google.com, but they give REFUSED when queried about them. Same for gmail.co.nz, gmail.es, gmail.in, but I don't think I am going to work through all the ccTLDs! I expect only someone from Google could really explain what they are up to...

-- Chris Thompson
University of Cambridge Information Services, Email: c...@uis.cam.ac.uk
Roger Needham Building, 7 JJ Thomson Avenue, Phone: +44 1223 334715
Cambridge CB3 0RB, United Kingdom.
Re: [dns-operations] What's the story on gmail.fr?
Seems obvious enough to me. The domains were bought to prevent a squatter getting them and they are parked with a company that does anti-phishing services (among other things). The Google DNS records are probably only intended to be read by the parking service and would be used to signal a change in the configuration. Well, that's how I would do it.

On Mon, Jul 7, 2014 at 11:42 AM, Chris Thompson c...@cam.ac.uk wrote:

On Jul 6 2014, Michele Neylon - Blacknight wrote:

Gmail.ie redirects to Gmail correctly .. Though I've never seen them advertise Gmail using anything other than the .com

gmail.uk & gmail.co.uk are delegated to ns{1,2,3,4}.google.com, but they give REFUSED when queried about them. Same for gmail.co.nz, gmail.es, gmail.in, but I don't think I am going to work through all the ccTLDs! I expect only someone from Google could really explain what they are up to...

-- Chris Thompson
University of Cambridge Information Services, Email: c...@uis.cam.ac.uk
Roger Needham Building, 7 JJ Thomson Avenue, Phone: +44 1223 334715
Cambridge CB3 0RB, United Kingdom.
Re: [dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)
Hi, Bert:

bert hubert wrote: Paul, I've written many many TCP/IP reassemblers and in fact the overhead is trivial. Your kernel does it all the time for example. The trick is to have a limited window in which you do the reassembly, and not scan over the entire file. Neither does a kernel.

Having QA'd IDSes in a past life, I don't disagree that the overhead, in terms of memory and CPU, ought to be minimal. However, the implementation complexity of a production grade TCP stream reassembler is high enough and the environment unforgiving enough that I'd prefer to hand the task off to a bullet-proof stand-alone library implementation. The last time I went looking for such an implementation I came up empty, but I'd love to be proven wrong.

Having said all that, it doesn't mean we aren't big fans of logging. But people I know are also big fans of logging being separate from their production servers, and this implies packet reassembly. This is why we have ample tooling in powerdns-tools to analyze packets. Packets also have the wonderful advantage that they represent what actually happened and not what the nameserver THOUGHT happened. We've for example been able to debug many issues caused by malformed packets. Such malformed packets would probably not retain their unique malformation when serialized to dnstap. As another example, we've in the past had cases where our own logging showed we were serving answers with low latency, yet packet traces showed significant time between query and response packets. The ultimate issue turned out to be queueing before our listening socket. Once we *got* the packet we answered it quickly enough. But we did not (and could not easily) account for when the packet hit the server. Our tool 'dnsscope' shows such statistics wonderfully.
I agree with you that in many cases being able to know what actually happened on the network vs what the DNS software thought had happened is quite handy, and I don't see packet capture as a technology being displaced for those cases when you want to get at the network-level artifacts. (I should note that dnstap will happily serialize malformed DNS *messages* [e.g., say some DNS record data is encoded incorrectly], but malformed *packets* are out-of-scope [e.g., say some middlebox corrupts a fragmented EDNS response and the receiver's kernel discards the packets instead of passing them to the nameserver process].) There are a lot of great use cases for DNS packet capture that can show network-level malfeasance (here I take an expansive view of network-level that includes everything after the initiator send()'s and the responder recv()'s) that will be awkward or impossible to replicate with an in-server logging facility like dnstap. Those use cases aren't what I'd like to focus on with dnstap. It's a nice bonus that the in-server approach obviates the need to condition the input by extracting DNS payload content from the lower layer frames (reassembling IP fragments, TCP streams, etc.), but that's not the primary reason I started working on the dnstap idea, however. The original, motivating use case for dnstap is passive DNS replication, and specifically the kind of hardened passive DNS replication that we implemented at Farsight (well, originally at ISC). It's worth quoting from Florian Weimer's original passive DNS paper on the hardening difficulties: Most DNS communication is transmitted using UDP. The only protection against blindly spoofed answers is a 16 bit message ID embedded in the DNS packet header, and the number of the client port (which is often 53 in inter-server traffic). What is worse, the answer itself contains insufficient data to determine if the sender is actually authorized to provide data for the zone in question. 
In order to solve this problem, resolvers have to carefully validate all DNS data they receive, otherwise forged data can enter their caches. (Passive DNS Replication § 3.3, Verification)

There are two interrelated issues here that Florian left to future implementers:

+ [B]lindly spoofed [UDP] answers. We solved this in the capture component of our passive DNS system (dnsqr) by keeping a table of outstanding UDP queries and doing full RFC 5452 (hi Bert!) § 9.1 style matching of the corresponding responses.

+ [T]he answer itself contains insufficient data to determine if the sender is actually authorized to provide data for the zone in question. This is trickier; basically there is nothing internal to the contents of a standalone DNS query/response transaction that allows us to evaluate the trustworthiness of the authority and additional sections of the response message. (For instance, if you see a query/response for the question name www.example.com, may the authority section specify NS records for example.com?)

The tack we took for this problem is to passively build
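The first bullet above (a table of outstanding UDP queries, with responses matched RFC 5452 § 9.1 style) can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the dnsqr implementation: `Key`, `key_for` and `OutstandingQueries` are hypothetical names, the 5-second timeout is arbitrary, and query names are compared case-insensitively.

```python
from typing import Dict, NamedTuple


class Key(NamedTuple):
    """Attributes a response must match, loosely following RFC 5452 section 9.1."""
    resolver_addr: str  # query source address == response destination address
    resolver_port: int  # query source port == response destination port
    server_addr: str    # query destination address == response source address
    msg_id: int
    qname: str
    qclass: int
    qtype: int


def key_for(resolver_addr: str, resolver_port: int, server_addr: str,
            msg_id: int, qname: str, qclass: int, qtype: int) -> Key:
    # Assumption: names are normalised to lower case without a trailing dot.
    return Key(resolver_addr, resolver_port, server_addr,
               msg_id, qname.rstrip(".").lower(), qclass, qtype)


class OutstandingQueries:
    """Remembers queries seen on the wire and admits only matching responses."""

    def __init__(self, timeout: float = 5.0) -> None:
        self.timeout = timeout              # hypothetical value
        self._sent: Dict[Key, float] = {}

    def saw_query(self, key: Key, now: float) -> None:
        self._sent[key] = now

    def accept_response(self, key: Key, now: float) -> bool:
        # A response is accepted once, and only if its query is still fresh.
        sent = self._sent.pop(key, None)
        return sent is not None and now - sent <= self.timeout
```

A blindly spoofed answer has to guess every field of the key while the matching query is still outstanding. Anything that fails the lookup is simply not recorded.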
Re: [dns-operations] Why would a recursive caching server not resolve a CNAME?
On Sun, Jul 6, 2014 at 3:45 PM, Mohamed Lrhazi ml...@georgetown.edu wrote:

Thanks Lyle, I did not mean to say that list was defunct, quite the opposite, I felt that I was a bit spamming it with a non-global operational DNS issue

Nah. No worries. This list traditionally has a wide range of topics. Your mail was polite and sane, and you seem interested in learning. Entirely appropriate (IMO) for this list.

W

Yes, I did not debug well, went for the quick fix of clearing the cache. That being said, end users query for the name and the IPs, always... but then here they were only getting the CNAME... so am trying to figure out in what circumstances would that occur... Would a recursive resolver that has the CNAME in cache, but the A records expired, fail to resolve the A records, return just the CNAME?

Thanks a lot, Mohamed.

On Sun, Jul 6, 2014 at 3:13 PM, Lyle Giese l...@lcrcomputer.net wrote:

You waited less than an hour before proclaiming the list defunct? It's Sunday in most of the world. Most of us are doing other things than sitting on this list. That said, my initial thought is that your server answered your question. Nothing more. Did you ask it for the A record for googlemail.l.google.com ? That might have told you more.

Lyle Giese
LCR Computer Services, Inc.

On 07/06/14 13:38, Mohamed Lrhazi wrote:

I am thinking this list is not appropriate for some of my questions... Could someone suggest a better one, maybe as active and rich, as this one, but more appropriate for general DNS discussions?

Thanks a lot, Mohamed.

On Sun, Jul 6, 2014 at 2:02 PM, Mohamed Lrhazi ml...@georgetown.edu wrote:

We had a little mail outage which turned out to be caused by one of our caching DNS servers returning the below incomplete reply. Clearing the cache on the problematic server fixed the issue. Am thinking it is now impossible for me to find the root cause in this instance... but wondering if you guys could hint at what could cause such a problem... bugs in the DNS servers involved?
temporary misconfig at Google's servers? network issue?

The setup is a bit convoluted: cache server -- resolver cache server -- Internet

The fix was clearing the cache at the first server, so I am guessing at some point the resolver gave the incomplete answer.

Thanks a lot, Mohamed.

➜ ~ dig mail.google.com @141.161.100.201

; <<>> DiG 9.9.5-3-Ubuntu <<>> mail.google.com @141.161.100.201
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20414
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;mail.google.com. IN A

;; ANSWER SECTION:
mail.google.com. 10213 IN CNAME googlemail.l.google.com.

;; AUTHORITY SECTION:
google.com. 96485 IN NS ns2.google.com.
google.com. 96485 IN NS ns3.google.com.
google.com. 96485 IN NS ns4.google.com.
google.com. 96485 IN NS ns1.google.com.

;; ADDITIONAL SECTION:
ns3.google.com. 108462 IN A 216.239.36.10
ns4.google.com. 108462 IN A 216.239.38.10
ns1.google.com. 108462 IN A 216.239.32.10
ns2.google.com. 108462 IN A 216.239.34.10

;; Query time: 22 msec
;; SERVER: 141.161.100.201#53(141.161.100.201)
;; WHEN: Sun Jul 06 12:42:09 EDT 2014
;; MSG SIZE rcvd: 207
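The symptom in the reply above (an A query answered with a CNAME but no address records for its target) can be checked mechanically. The following Python sketch only detects the symptom and says nothing about its cause. `follow_cnames` and the `(owner, type, rdata)` tuple shape are hypothetical simplifications of an already parsed answer section, not part of any DNS library.

```python
from typing import List, Sequence, Tuple

# Simplified, hypothetical record shape: (owner name, record type, rdata).
RR = Tuple[str, str, str]


def _norm(name: str) -> str:
    """Compare names case-insensitively and without the trailing dot."""
    return name.rstrip(".").lower()


def follow_cnames(qname: str, answers: Sequence[RR],
                  max_hops: int = 8) -> Tuple[str, List[str]]:
    """Follow the CNAME chain inside one answer section.

    Returns the last name reached and the A rdata found for it. An empty
    address list means the chain ended without any address records.
    """
    name = _norm(qname)
    for _ in range(max_hops):
        addrs = [rd for owner, rtype, rd in answers
                 if rtype == "A" and _norm(owner) == name]
        if addrs:
            return name, addrs
        targets = [rd for owner, rtype, rd in answers
                   if rtype == "CNAME" and _norm(owner) == name]
        if not targets:
            return name, []
        name = _norm(targets[0])
    return name, []
```

Fed the answer section shown above, it stops at googlemail.l.google.com with an empty address list. A monitoring check built on this could alert before clients notice, and the next step would be to query the resolver for the target's A record directly, as Lyle suggests.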
Re: [dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)
On Jul 8, 2014, at 2:42 AM, Robert Edmonds edmo...@mycre.ws wrote:

The original, motivating use case for dnstap is passive DNS replication, and specifically the kind of hardened passive DNS replication that we implemented at Farsight (well, originally at ISC).

I think dnstap is a very good idea; still, it would be helpful to understand why it wasn't implemented in IPFIX, rather than in a custom telemetry format . . .

-- Roland Dobbins rdobb...@arbor.net // http://www.arbornetworks.com
Equo ne credite, Teucri. -- Laocoön
Re: [dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)
Roland Dobbins wrote: I think dnstap is a very good idea; still, it would be helpful to understand why it wasn't implemented in IPFIX, rather than in a custom telemetry format . . .

We did not frame the evaluation in terms of selecting (or building) a particular telemetry format. Instead we focused on some finer grained functional areas where we knew we would have to build or select particular components:

1) The dnstap idea entails modifying existing DNS servers, adding inline payload logging capabilities to the fast path of the DNS server. Performance is a key consideration, and we would prefer to have the capability to, under high load, drop excess logging payloads rather than block the server from making progress at its real job of returning answers to clients. So we need some sort of asynchronously-processed circular queue that can offload as much of this work as possible from the DNS server's critical path.

2) A way of encoding the log payload from the DNS server's internal, in-memory representation, to a serialized byte sequence that can be transported over something like a socket or to a file. (The encoding.)

3) A way of actually transporting the serialized log payload to a receiver over something like a socket or file. (The transport.)

I don't believe IPFIX has much to offer for #1, since this is an overly specific (yet quite important) implementation detail. We ended up writing our own lockless memory-barrier based circular buffer implementation, based on a technique used in the Linux kernel:

https://www.kernel.org/doc/Documentation/circular-buffers.txt

and then placing this in a library for re-use in different applications.
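The drop-rather-than-block behaviour described in #1 can be illustrated with a small Python sketch. It shows only the queue semantics (head and tail indices over a power-of-two array, one slot kept empty). It is single-threaded and has none of the lockless memory-barrier machinery of the real library, and `LossyRing`, `offer` and `poll` are hypothetical names.

```python
from typing import Any, List, Optional


class LossyRing:
    """Fixed-size circular queue that drops new items when full."""

    def __init__(self, size: int) -> None:
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of two >= 2")
        self._buf: List[Optional[Any]] = [None] * size
        self._mask = size - 1  # index wrap-around via bit mask
        self._head = 0         # next slot the producer writes
        self._tail = 0         # next slot the consumer reads
        self.dropped = 0       # payloads discarded under load

    def offer(self, item: Any) -> bool:
        """Producer side (the server's fast path): never waits."""
        nxt = (self._head + 1) & self._mask
        if nxt == self._tail:  # full: one slot always stays empty
            self.dropped += 1
            return False
        self._buf[self._head] = item
        self._head = nxt
        return True

    def poll(self) -> Optional[Any]:
        """Consumer side (the logging worker): returns None when empty."""
        if self._tail == self._head:
            return None
        item = self._buf[self._tail]
        self._buf[self._tail] = None
        self._tail = (self._tail + 1) & self._mask
        return item
```

The design choice is that `offer` has a bounded, constant cost whether or not the consumer keeps up, so a slow log receiver costs the server dropped log payloads and not delayed answers.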
If you combine #1 and #3 above and allow them to be implemented in a single package, one obvious contender is ZeroMQ; ultimately I think ZeroMQ is not that great of a choice for embedding *directly* in DNS servers for a few different reasons: e.g., there are several different versions (the Debian archive offers ZeroMQ major versions 2.x, 3.x, and 4.x) and the compatibility guarantees are somewhat convoluted. So we did not select ZeroMQ for use in the DNS server-side component. But I didn't want to preclude the possibility of re-sending dnstap payloads over binary-clean transports that are transparent to payload content like ZeroMQ, hence the transport/encoding split between #2 and #3. It looks like maintaining the #2/#3 transport/encoding split with IPFIX is impossible; it appears IPFIX is tightly coupled to the IP transport protocol: there is an IPFIX-over-UDP, IPFIX-over-TCP, IPFIX-over-SCTP... What if you want to send payloads over an AF_UNIX socket, or via an HTTP(S) GET/POST, WebSockets connection, some new technology that hasn't been invented yet, etc.? Enforcing a firm separation between a generic lower-level transport and a specific upper-level encoding is something that worked out pretty well for us in a different context: http://www.caida.org/workshops/isc-caida/1210/slides/isc1210_redmonds.html I say appears above because my next complaint is that there are too many specifications documents for IPFIX. There are several dozen listed here: https://datatracker.ietf.org/wg/ipfix/documents/ This is in contrast to generic serialization systems for structured data like Protocol Buffers, Thrift, Apache Avro, MessagePack, Cap'n Proto, BSON, etc. etc. Most of these can be described in a single fairly succinct document each; IPFIX appears to encompass a lot more than just serialization of structured data and consequently has a much larger specification footprint. 
If IPFIX is well-suited for applications other than representing IP flows, it is awfully hard to tell from the outside without plowing through a ton of specifications. This is itself a downside; we have to convince not just ourselves, but DNS software vendors to import this code and DNS software users that they might want to use this code. For a dnstap file format I was awfully tempted to use the traditional pcap-savefile(5) format with a new linktype, but pcap has a hard 64K frame size limit, which would make it impossible to represent dnstap payloads with maximally sized DNS messages in a single frame, which I wanted to make a hard requirement for dnstap. I tried to find the analogous limit for IPFIX, which appears to also use a 16-bit field to represent message length. (Possibly IPFIX can split payloads across multiple messages, but if it can, this is not readily apparent, and we would prefer not to have to invoke such a capability anyway.) Also, I found the following blog post rather interesting: http://www.ntop.org/nprobe/why-nprobejsonzmq-instead-of-native-sflownetflow-support-in-ntopng/ The fact that not even flow probe vendors are happy with IPFIX is somewhat telling. I do not know enough about flow probes to evaluate most of his very specific technical complaints with IPFIX, but something like JSON or
Re: [dns-operations] 'dnstap' (Re: Prevalence of query/response logging?)
On Jul 8, 2014, at 9:33 AM, Robert Edmonds edmo...@mycre.ws wrote:

It looks like maintaining the #2/#3 transport/encoding split with IPFIX is impossible

It's an abstract goal which isn't an operational consideration, IMHO.

If IPFIX is well-suited for applications other than representing IP flows, it is awfully hard to tell from the outside without plowing through a ton of specifications. This is itself a downside; we have to convince not just ourselves, but DNS software vendors to import this code and DNS software users that they might want to use this code.

It's actually pretty well-known in the operational community that IPFIX is extensible and well-suited to this sort of task. In terms of collection/analysis tools, putting this sort of data into a standardized format would go a long way towards ensuring that tools are available which can use it, and offer easy combinatorial analysis with data such as more classical flow records from routers/layer-3 switches, policy and even data from things like firewalls and load-balancers, etc. Adoption by DNS vendors/coders is only half the battle, and since this is Something New to DNS vendors/coders, but not to vendors/coders of telemetry analysis systems, it's a consideration.

to represent dnstap payloads with maximally sized DNS messages in a single frame, which I wanted to make a hard requirement for dnstap.

Is this intended to help with performance, or . . . ?

(Possibly IPFIX can split payloads across multiple messages,

It can.

What if you want to send payloads over an AF_UNIX socket, or via an HTTP(S) GET/POST, WebSockets connection, some new technology that hasn't been invented yet, etc.?

These are corner-cases, IMHO.

I say appears above because my next complaint is that there are too many specifications documents for IPFIX.

One can say the same of DNS.
All it would've taken to get the relevant information was asking folks plugged into the flow telemetry community about these various issues, rather than wading through lots of docs.

Also, I found the following blog post rather interesting:

Without wasting a lot of space here dissecting it, there's a great deal he says which I don't agree with, FWIW.

It just doesn't look like a good fit for what we want to make possible, and there are a lot of general purpose technologies out there that I would consider first before considering IPFIX for a particular application.

The barrier to adoption, as I see it, is that this is now a one-off telemetry format, which generally (not always) tends to lead to low-to-no adoption. I fear that's the case here (hopefully, I'm wrong!). I personally think going the one-off route (pardon the pun, heh), no matter the perceived benefits, is self-defeating; I'll look into what's necessary to somehow transcode this into IPFIX, but it would've been much simpler (and much more likely to be widely adopted, IMHO) if IPFIX had simply been adopted in the first place.

I'm really grateful for the detailed explanation and insight into your thinking. I don't agree with the goal prioritization: a primary goal of any form of telemetry export should be relatively easy compatibility with existing collection/analysis systems, and the use of formats with which people are likely to have some familiarity and experience, in order to maximize the chances of broad adoption. Even so, it's educational and much appreciated!

-- Roland Dobbins rdobb...@arbor.net // http://www.arbornetworks.com
Equo ne credite, Teucri. -- Laocoön