Re: [dns-operations] Name servers returning incorrectly truncated UDP responses
On Sat, Jul 30, 2022 at 11:14 AM Ondřej Surý wrote:

> I am 99% sure the fpdns is wrong and this is not djbdns. The fpdns relies
> on subtle differences between various DNS implementations and is often
> wrong because there's either not enough data or just not enough
> differences. That's what I've seen when we started with Knot DNS - the
> quality implementations just didn't have enough differences between them,
> because they adhered to standards, so fpdns just could not tell the
> difference.

If there are SUBTLE differences between DJB DNS and anything else, I would
die of shock. I will offer a beer to anyone who shows me anything even
remotely close to as broken as that POS. (The DNS community should maybe
start offering rewards for replacing it, to anyone running djbdns, in real
currency, just to purge it from the internet.)

Brian

> Cheers,
> --
> Ondřej Surý (He/Him)
>
> On 30. 7. 2022, at 19:37, Puneet Sood wrote:
>
>> On Sat, Jul 30, 2022 at 10:26 AM Dave Lawrence wrote:
>>
>>> Greg Choules via dns-operations writes:
>>>
>>>> I am including in this mail the RNAME from the SOA (same for both
>>>> zones) in the hope that someone who is responsible for DNS at Sony
>>>> entertainment will see this and take note.
>>>
>>> And tell us what in the world DNS software they're running, and why
>>> they chose it.
>>
>> Jaap up-thread used fpdns to figure out the first question.
>>
>> fpdns e.ns.email.sonyentertainmentnetwork.com
>> fingerprint (e.ns.email.sonyentertainmentnetwork.com, 207.251.96.133):
>> DJ Bernstein TinyDNS 1.05 [Old Rules]

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
Re: [dns-operations] Input from dns-operations on NCAP proposal
On Fri, Jun 3, 2022 at 3:17 PM John R Levine wrote:

> On Fri, 3 Jun 2022, John Levine wrote:
>
>>> In such a configuration, if the host name "foo" matches the candidate
>>> TLD "foo", and the latter is changed from NXDOMAIN ...
>
>> Do we have any idea how many systems still use search lists? We've
>> been saying bad things about them at least since .CS was added in 1991.
>
> It occurs to me there is another way to look at this. There are already
> 1487 delegated TLDs, and I doubt anyone could name more than a small
> fraction of them. If this increases the number of names that will break
> search lists from 1487 to 1488, how much of a problem is this likely to
> be in practice, which leads back to ...

If it were ONLY a progression of 1487->1488, it might not be that bad (but
again, that all depends on what number 1488 actually is). What it actually
is, is an exercise in survivorship bias.

Anyone who might have been impacted by any of the earlier rounds of
expansion will (likely) have learned their lesson. That lesson may depend
on tribal knowledge, which might not be reliable enough for any previous
victim to not be re-victimized. Anyone not previously affected may be
unaware of the risk their own set-up places them in, until their choices
run up against newly deployed TLDs.

Until the practice or standard/implementation for search lists is fully
deprecated, the risk will remain, for either new TLDs being deployed or
new host names or naming conventions being deployed. Unimaginative host
names like "mail001" are likely safe. However, naming hosts after some
class of entities, like manufacturers or fast food companies or even
classes of things, will ironically be risky.

The best analogy I can think of is playing "minesweeper" on a huge board,
where the number of mines periodically gets increased, where there are no
signals of adjacent mines (1-8), no flags, and no automatic flooding of
zero-mine areas.
Spots you have clicked on could be subsequently mined, and you lose. It is
an asynchronous race condition, where an external party is making moves
(adding mines) on your behalf. It would not be considered a "fun" game,
IMNSHO.

Brian

P.S. Having "ndots:N" for N>0 isn't necessarily safe, either. Any new TLD
that matches an internal namespace component rather than a hostname won't
necessarily be discovered until registrations begin.
Re: [dns-operations] Input from dns-operations on NCAP proposal
On Fri, Jun 3, 2022 at 11:57 AM Thomas, Matthew via dns-operations <
dns-operati...@dns-oarc.net> wrote:

> Thank you David. That change from NXDOMAIN to NOERROR/NODATA and things
> going "boom" is exactly what we are looking for community input towards.
> Do folks know of applications, or things like suffix search list
> processing, that will change their behavior?

There is one particular non-default configuration that definitely would
make things go "boom". (This is not a comprehensive list of behaviors,
just one example that is known.)

If the options value of "ndots:N" is set in /etc/resolv.conf (or whatever
analogous configuration elements exist in non-Unix/Linux systems) to a
value of N==0, then a lookup for a single-label name (e.g. "foo") would be
made as an absolute query first, before doing search list additions.
"ndots" can generally be any number between 0 and X, for
implementation-specific X. Some implementations cap X at 15, some at 255;
there may be other implementations.

In such a configuration, if the host name "foo" matches the candidate TLD
"foo", and the latter is changed from NXDOMAIN (non-existing in the root)
to anything else (e.g. a delegation is made for "foo"), this will break
search list processing for "foo". I.e. earth-shattering kaboom.

BEFORE: "foo" => NXDOMAIN, resolver then tries various
"foo.bar.example.com", "foo.example.com" etc.
AFTER: "foo" => not NXDOMAIN, resolver stops after the answer it gets (especially if there is a matching QTYPE and RRTYPE in the Answer, such as QTYPE == A, answer is 127.0.53.53) Brian > Matt > > On 6/2/22, 5:22 PM, "David Conrad" wrote: > > Hi, > > On Jun 1, 2022, at 12:39 AM, Petr Špaček wrote: > > On 24. 05. 22 17:54, Vladimír Čunát via dns-operations wrote: > >>> Configuration 1: Generate a synthetic NXDOMAIN response to all > queries with no SOA provided in the authority section. > >>> Configuration 2: Generate a synthetic NXDOMAIN response to all > queries with a SOA record. Some example queries for the TLD .foo are below: > >>> Configuration 3: Use a properly configured empty zone with correct > NS and SOA records. Queries for the single label TLD would return a NOERROR > and NODATA response. > >> I expect that's OK, especially if it's a TLD that's seriously > considered. I'd hope that "bad" usage is mainly sensitive to existence of > records of other types like A. > > > > Generally I agree with Vladimir, Configuration 3 is the way to go. > > > > Non-compliant responses are riskier than protocol-compliant > responses, and option 3 is the only compliant variant in your proposal. > > Just to be clear, the elsewhere-expressed concern with configuration 3 > is that it exposes applications to new and unexpected behavior. That is, > if applications have been “tuned” to anticipate an NXDOMAIN and they get > something else, even a NOERROR/NODATA response, the argument goes those > applications _could_ explode in an earth shattering kaboom, cause mass > hysteria, cats and dogs living together, etc. > > While I’ve always considered this concern "a bit" unreasonable, I > figure its existence is worth pointing out. 
> Regards,
> -drc
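The ndots-dependent lookup ordering discussed in this thread can be sketched with a simplified model of a glibc-style stub resolver. This is an illustration only, not any particular implementation; real resolvers differ in details such as trailing-dot handling and error fallback. The resolver tries each candidate in order until one does not return NXDOMAIN, which is why a newly delegated TLD "foo" short-circuits the search list when ndots is 0.

```python
def query_order(name, search_list, ndots=1):
    """Return the ordered list of absolute names a stub resolver would
    try, using a simplified model of glibc's ndots/search-list logic.

    - A name ending in '.' is fully qualified: queried as-is, no search.
    - If the name contains >= ndots dots, it is tried as an absolute
      query FIRST, then with each search-list suffix appended.
    - Otherwise the search-list forms come first, the bare name last.
    With ndots:0 every name (even a single label) qualifies for the
    absolute-first path.
    """
    if name.endswith("."):
        return [name]
    dots = name.count(".")
    as_is = name + "."
    with_search = [f"{name}.{suffix}." for suffix in search_list]
    if dots >= ndots:
        return [as_is] + with_search
    return with_search + [as_is]
```

With ndots:0, `query_order("foo", ["corp.example.com"], ndots=0)` puts "foo." first; once the root stops answering NXDOMAIN for "foo", the search-list candidates are never reached.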
Re: [dns-operations] Input from dns-operations on NCAP proposal
The in-zone CNAME record would cause re-queries from resolvers, to
estimate query volume. Additional child record CNAMEs could be added with
the same or similar target(s).
- Each CNAME would be added to the root zone, since there is no delegation
  involved.
- e.g. common-name.candidate-tld. CNAME
  some-other-target-that-is-cnamed-to-nxdomain.ncap.example.net.

It would also be possible to add a wildcard CNAME below any FQDN, which
would match any descendant of the FQDN for which no existing name was
present in the zone. (Details of wildcard matching are omitted for
brevity.)
- e.g. *.candidate-tld. CNAME
  wildcard-target-that-is-cnamed-to-nxdomain.ncap.example.net.

It would be advisable to do this first, before any consideration of doing
option 3. None of the other options is advisable.

Brian Dickson

P.S. This solution can be tested and validated relatively easily, as it
only involves normal, standard DNS server(s) and supported record types.

P.P.S. Of course, you would need to supply your own real domain name
anywhere in the above that "example.net" appears.

> Best,
>
> Matt Thomas
> NCAP Co-chair
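The two record shapes described above could look like the following zone-file fragment. This is a sketch only: the candidate TLD "foo" is hypothetical, and the ncap.example.net targets are the placeholder names from the text.

```
; Sketch: proposed records for a hypothetical candidate TLD "foo".
; Both CNAMEs live in the root zone itself (no delegation involved).
common-name.foo.  86400  IN  CNAME  some-other-target-that-is-cnamed-to-nxdomain.ncap.example.net.
; Wildcard variant: matches any otherwise-nonexistent descendant of foo.
*.foo.            86400  IN  CNAME  wildcard-target-that-is-cnamed-to-nxdomain.ncap.example.net.
```

Resolvers chasing either CNAME end up querying the ncap.example.net servers, whose query logs then give the volume estimate.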
Re: [dns-operations] [Ext] Obsoleting 1024-bit RSA ZSKs (move to 1280 or algorithm 13)
On Thu, Oct 21, 2021 at 7:54 PM George Michaelson wrote:

> I would be concerned that the language which makes the recommendation
> HAS to also note the operational problems. You alluded to the UDP
> packetsize problem. And implicitly the V6 fragmentation problem. What
> about the functional limitations of the HSM and associated signing
> hardware? I checked, and the units we operate (for other purposes than
> DNSSEC) don't support RSA1280. They do RSA1024 or RSA2048. This is
> analogous to the recommendation I frequently make casually, to stop
> using RSA and move to the shorter cryptographic signature algorithms
> to bypass the size problem: They are slower, and they aren't supported
> by some hardware cryptographic modules.

Okay, yes, this was something I wasn't taking into consideration. (My
apologies to everyone.) Everything is, to some degree or another, a
trade-off.

So, out of curiosity (and for a single data point I suppose), which
non-RSA algorithms does your HSM support? If it includes one of the
elliptic curve algorithms, I think the interesting thing would be the
respective multipliers on slowdown and crypto strength (work factor).
E.g. a 50x slowdown which produces, say, a 1000x work-factor increase
would be worth considering seriously, but it is unclear what the work
factor increase would be.

I think additionally, anyone looking at what to do would probably need to
determine two parameters:
- Natural signing rate (e.g. due to changes in data to be signed)
- Re-signing time (speed x number of entries)

There are places on the performance curves that are unsupportable, e.g.
when the number of entries is large enough and the natural signing rate is
high enough that the re-signing time becomes infinite. In that situation,
there are not a lot of alternatives: replace the HSM(s); scale
horizontally with additional HSMs operating in parallel; use a faster (and
presumably weaker) algorithm.
The fourth option is to perform signing using non-HSM equipment, which has challenges of its own. > Even without moving algorithm, Signing gets slower as a function of > keysize as well as time to brute force. So, there is a loss of > "volume" of signing events through the system overall. Time to resign > zones can change. Maybe this alters some operational boundary limits? > (from what I can see, 1024 -> 1280 would incur 5x slowdown. 1024-2048 > would be 10-20x slowdown. RSA to elliptic curve could be 50x or worse > slowdown) > > If the case for "bigger" is weak, then if the consequences of bigger > are operational risks, maybe bigger isn't better, if the TTL bound > life, is less than the brute force risk? > > A totally fictitious example. but .. lets pretend somebody has locked > in to a hardware TPM, and it simply won't do the recommended algorithm > but would power on with 1024 until the cows come home? If the TTL was > kept within bounds, if resign could be done in a 10 day cycle rather > than a 20 day cycle (for instance) I don't see why the algorithm > change is the best choice. > > You are correct, and much depends on things like stability of the zone and total zone size. The ultimate limit is really the utilization level of the signing hardware. Once the hardware is operating full-out constantly, it is only a matter of time before the theoretical adversarial risk exceeds the zone operator's risk tolerance. If the hardware performance generally continues to improve along the current exponential scale (e.g CPU and GPU performance), signing hardware will eventually be obsolete and need replacing. 
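The two parameters above (natural signing rate and re-signing time) can be combined into a rough feasibility check. This is a back-of-the-envelope sketch with made-up parameter names, not a model of any particular HSM:

```python
def resign_backlog_feasible(records, sigs_per_sec, changes_per_sec,
                            resign_window_days):
    """Rough feasibility check for an HSM (or HSM pool) re-signing a zone.

    Capacity must cover both the 'natural' signing caused by zone changes
    and the rolling re-sign of every record before signatures expire.
    If changes alone saturate the hardware, the full re-sign never
    completes: effectively infinite re-signing time.
    Returns (feasible, days_needed_for_full_resign).
    """
    spare = sigs_per_sec - changes_per_sec  # capacity left for re-signing
    if spare <= 0:
        return False, float("inf")
    days_needed = records / spare / 86400.0
    return days_needed <= resign_window_days, days_needed
```

E.g. a zone of a million records on hardware doing 500 sig/s with 10 changes/s re-signs comfortably inside a 10-day window, while hardware already saturated by changes never finishes at all.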
Brian > On Fri, Oct 22, 2021 at 11:46 AM Brian Dickson > wrote: > > > > > > > > On Wed, Oct 20, 2021 at 10:22 AM Paul Hoffman > wrote: > >> > >> On Oct 20, 2021, at 9:29 AM, Viktor Dukhovni > wrote: > >> > >> > I'd like to encourage implementations to change the default RSA key > size > >> > for ZSKs from 1024 to 1280 (if sticking with RSA, or the user elects > RSA). > >> > >> This misstates the value of breaking ZSKs. Once a KSK is broken, the > attacker can impersonate the zone only as long as the impersonation is not > noticed. Once it is noticed, any sane zone owner will immediately change > the ZSK again, thus greatly limiting the time that the attacker has. > > > > > > This presupposes what the ZSKs are signing, and what the attacker does > while that ZSK has not been replaced. > > > > For example, if the zone in question is a TLD or eTLD, then the records > signed by the ZSK would include almost exclusively DS records. > > DS records do change occasionally, so noticing a changed DS with valid > signature is unlikely for anyone other than the operator of the > corresponding delegated zone. > &
Re: [dns-operations] [Ext] Obsoleting 1024-bit RSA ZSKs (move to 1280 or algorithm 13)
On Wed, Oct 20, 2021 at 10:22 AM Paul Hoffman wrote:

> On Oct 20, 2021, at 9:29 AM, Viktor Dukhovni wrote:
>
>> I'd like to encourage implementations to change the default RSA key
>> size for ZSKs from 1024 to 1280 (if sticking with RSA, or the user
>> elects RSA).
>
> This misstates the value of breaking ZSKs. Once a KSK is broken, the
> attacker can impersonate the zone only as long as the impersonation is
> not noticed. Once it is noticed, any sane zone owner will immediately
> change the ZSK again, thus greatly limiting the time that the attacker
> has.

This presupposes what the ZSKs are signing, and what the attacker does
while that ZSK has not been replaced.

For example, if the zone in question is a TLD or eTLD, then the records
signed by the ZSK would include almost exclusively DS records. DS records
do change occasionally, so noticing a changed DS with valid signature is
unlikely for anyone other than the operator of the corresponding delegated
zone.

An attacker using such a substituted DS record can basically spoof
anything they want in the delegated zone, assuming they are in a position
to do that spoofing. And how long those results are cached is controlled
only by the resolver implementation and operator configuration, and the
attacker.

So, the timing is not the duration until the attack is noticed
(NOTICE_DELAY), it is the range MIN_TTL to MIN_TTL+NOTICE_DELAY (where
MIN_TTL is min(configured_TTL_limit, attacker_supplied_TTL)).

The ability of the operator of the delegated zone to intervene with the
resolver operator is not predictable, as it depends on what relationship,
if any, the two parties have, and how successful the delegated zone
operator is in convincing the resolver operator that the cached records
need to be purged.

Stronger ZSKs at TLDs are warranted even if the incremental improvement is
less than what cryptographers consider interesting, IMNSHO.
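The MIN_TTL/NOTICE_DELAY timing argument reduces to a couple of lines; a toy illustration with assumed numbers (one week of attacker-supplied TTL, a one-day resolver TTL cap, one hour to notice):

```python
def exposure_window(attacker_supplied_ttl, configured_ttl_limit,
                    notice_delay):
    """Window (seconds) during which a spoofed, validly-signed DS can
    keep poisoning resolutions from a cache. The cached lifetime is the
    minimum of the attacker-supplied TTL and the resolver's TTL cap;
    total exposure runs from MIN_TTL up to MIN_TTL + NOTICE_DELAY."""
    min_ttl = min(attacker_supplied_ttl, configured_ttl_limit)
    return min_ttl, min_ttl + notice_delay
```

With a 604800s attacker TTL, an 86400s resolver cap, and a 3600s notice delay, the exposure runs from 86400s to 90000s; raising the cap, not faster noticing, dominates the window.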
It's not an all-or-nothing thing (jump by 32 bits or don't change), it's a
question of what reasonable granularity should be considered in increments
of bits for RSA keys. More of those increments is better, but at least 1
such increment should be strongly encouraged.

I think Viktor's analysis justifies the suggestion of 256 bits (of RSA) as
the granularity, and thus recommending whatever in the series 1280, 1536,
1792, 2048 the TLD operator is comfortable with, with recommendations
against going too big (and thus tripping over the UDP-TCP boundary).

> In summary, it is fine to propose that software default to issuing
> larger RSA keys for ZSKs, but not with an analysis that makes a lot of
> unstated guesses. Instead, it is fine to say "make them as large as
> possible without causing automatically needing TCP, and ECDSA P256 is a
> great choice at a much smaller key size".

I'm fine with adding those to the recommendations (i.e. good guidance for
the rationale for picking ZSK size and/or algorithm), with the added
emphasis on not doing nothing.

Brian
Re: [dns-operations] Verisign won't delete obsolete glue records?
On Mon, Mar 1, 2021 at 4:41 PM Doug Barton wrote: > > Thanks for the explanation about objects vs. host names. In this case > it's not a third party that is using the old names, it's still us, so we > don't want to "break" those delegations. > > Perhaps I didn't ask my question clearly enough. Let's take a delegation > for example.com to ns1.example.info and ns2.example.info. There will be > no host records at Verisign for those two names, right? So how are those > delegation host names represented in the database, and why can't my > now-obsolete glue records be represented the same way? > Okay, I think I understand better what you're asking. My understanding is that, even though the delegation is to an off-TLD name server, the registry still needs an object. So, the glue rules mean that object will have a name, but not have any addresses. Those objects' names are basically first-come, first served. But, if you rename them, the original name is no longer in existence. At that point, if you wanted to, you could create a new object with the now-vacated name. (This may even be what you want to do, one way or another.) I'm pretty sure you can't have different objects using the same name at the same time. And basically, if you want the other delegations to point to the same/original IP, or to the new name, what you really want to do is rename the host, not change the delegation of the domain. (I'm assuming you want all the domains to point to a new name, and not have any delegations pointing to the old name). If you did the re-delegation first, that could be a bit tricky. You might need to do the following: - Rename the new host record that was created to a throw-away name - Change the delegation to the original name (and re-connect to the original object) - Delete the now-unreferenced throw-away name - Rename the original object host to the new name you want to use for all your delegations Repeat the above for each name server host name. 
After the above steps, there will no longer be any host objects which are
children of the "primary" domain. Thus, you won't need to try to delete
anything, because the name will already no longer exist. (The object will,
but it will have a new name.)

Brian
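For reference, each rename step above would be an EPP host update changing the object's name in place (the host mapping in RFC 5732). A rough sketch, with placeholder host names and transaction ID; your registrar's tooling may wrap this differently:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
  <command>
    <update>
      <host:update xmlns:host="urn:ietf:params:xml:ns:host-1.0">
        <!-- existing host object, identified by its current name -->
        <host:name>ns1.example.info</host:name>
        <host:chg>
          <!-- new name; same object, so delegations keep pointing at it -->
          <host:name>ns1.new-name.example</host:name>
        </host:chg>
      </host:update>
    </update>
    <clTRID>host-rename-001</clTRID>
  </command>
</epp>
```

Because the rename changes the object rather than creating a new one, every domain delegated to it follows along without touching each delegation.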
Re: [dns-operations] Verisign won't delete obsolete glue records?
On Mon, Mar 1, 2021 at 3:28 PM Doug Barton wrote: > I'm being told something by my registrar which I find impossible to > believe, but they keep telling me that they have accurately transmitted > my request, and that the answer is no. "Let me 'splain. No, there is too > much. Let me sum up." > > > So what am I missing here? I know that in the past it was possible, and > in fact desirable, to remove those obsolete glue records, but now it's > impossible to do it? > Not speaking with knowledge of the specifics, only concerning the general case: The RRR (registry/registrar/registrant) system is somewhat complex, and arcane. The common language used, EPP, is capable of representing relationships, but is restrictive. The root problem is the object model (tied to the database nature of registries). A glue record is basically a host record, with a name and IP address(es). Domains (registered with the registry, belonging to registrants) have their delegations represented as references to host records. This is where things break down: the delegation is to the object, not the name. If you change your delegations to a different name, that will either change the reference to a different object, or possibly create a new object and use that for its delegation reference. The old object (with the original name) still exists. If (and ONLY if) there are no other references (i.e. delegations) to that object, can the object be deleted. That rule is enforced, and is tied to the database model for hosts and domains. You do generally have the option of renaming the object, and there are some interesting options available. One is to change the name to an off-TLD name, in which case the corresponding IP address(es) are removed. Using an off-TLD name that is deliberately and permanently unresolvable is a nice, clean way of "breaking" the other domains, who should really not have been using your name server as their name server without your permission. 
An example name would be "SOME_RANDOM_VALUE".empty.as112.arpa
(empty.as112.arpa is a zone intended to never have any non-apex records,
as the name suggests, and its existence is defined for that purpose in RFC
7535). For "SOME_RANDOM_VALUE", it is recommended that you use a GUID-type
generated value for the label, to ensure it does not collide with anyone
else doing the same thing. (There are others doing this already.)

Hope this helps explain the situation. (It's not your fault, and it isn't
the registry's fault; it is whoever has for whatever reason delegated some
other domain to your name server that has caused the problem.)

Brian
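Generating such a label is a one-liner; a sketch using Python's uuid module (the function name is mine):

```python
import uuid

def unresolvable_host_name():
    """Build a deliberately-unresolvable host name under
    empty.as112.arpa (RFC 7535), using a random UUID as the label so it
    cannot collide with anyone else using the same convention."""
    return f"{uuid.uuid4().hex}.empty.as112.arpa"
```

The result looks like `3f2a...c9.empty.as112.arpa`; since empty.as112.arpa never contains non-apex names, any delegation pointed at it is permanently and cleanly lame.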
Re: [dns-operations] Quad9 DNSSEC Validation?
On Mon, Mar 1, 2021 at 2:16 PM Viktor Dukhovni wrote:

> On Mon, Mar 01, 2021 at 09:12:38AM +0100, Petr Špaček wrote:
>
>> In my experience negative trust anchors for big parts of MIL and/or GOV
>> are way more common, let's not pick specifically on Quad9. For periods
>> of time I have seen with other big resolver operators as well.
>
> On the .gov side, just 10 of 1239 domains fail to return validated
> DNSKEY RRsets (with rounded number of weeks duration):
>
>   weeks | domain
>   ------+--------------------
>     148 | uscapitolpolice.gov

Just an observation, in terms of real-world implications of DNSSEC
validation failures: I hope this wasn't in any way a contributing factor
in the 2021-01-06 events/response.

Brian
Re: [dns-operations] [Ext] Possibly-incorrect NSEC responses from many RSOs
On Sun, Feb 28, 2021 at 12:37 PM Viktor Dukhovni wrote: > On Sun, Feb 28, 2021 at 08:52:38PM +0100, Vladimír Čunát wrote: > > > On 2/28/21 8:47 PM, Paul Hoffman wrote: > > >> [1]https://tools.ietf.org/html/rfc8482#section-7 [tools.ietf.org] > > > That RFC (a) doesn't update RFC 4025 and (b) is only about QTYPE of > "ANY". > > > > I meant just the informal future-work note focused on QTYPE=RRSIG (in > > the linked section), to support my claim that there are advantages in > > avoiding full replies to such queries. > > Not only are "full" replies not needed, detached from the RRSet for > which an RRSIG is the signature, the content of the RRSIG is both > useless and meaningless. Since it can never be validated it should not > be cached. > > An interoperable synthetic reply when the qname exists would be: > > 0 IN RRSIG RRSIG 255 0 <0x0> <0x0> 0 > AA== > > A signature payload of a single 0 byte avoids potential issues with > unexpected zero-length signatures. > > * It is less clear what to do when the qname is wildcard-synthesized. > Should there be NSEC records to validate a wildcard-based response??? > > My take is "no", just always set the closest encloser to equal the > qname, and let the zero TTL take care of not having such replies > stick around in caches to imply anything about the node. > > Iterative resolvers should not cache RRSIG replies, regardless of TTL. > > I'm writing a new stub resolver for Haskell, and even prior to this > thread my plan was to not permit RRSIG queries, because they made no > sense. I could instead just return the above synthetic response without > asking any upstream server, but an error telling the user they're doing > the wrong thing seems more appropriate. > I think this is vaguely interesting, if for no other reason that it exposes some weaknesses in the original 403[345] RFCs. Two relevant questions are: - Are the observed RRSIG queries the result of an actual client's behavior? 
(The alternative being diagnostic tool usage, and I'd be surprised if it was the former.) - If a validating client is either using a security-oblivious resolver (the client might be a stub or a forwarder), or has an intermediate network device interfering with DNSSEC, is it even possible to do validation, ever? It's clear that the relevant portion(s) of 403[345] don't correctly handle the RRSIG case, and pretty much cannot (since RRSIG needs to know the original QTYPE to select/filter the RRSIG). If (big IF) there is interest in solving for the second case (validating client behind a middle box or resolver that is not returning RRSIG records, possibly not doing EDNS, and may or may not be security-aware with respect to CD and AD bits), what then? It might be interesting to know how much of the Internet is in those situations (validating stub or validating resolver, but unable to actually validate). The follow-up question, if there is a substantial portion of the client base that is impacted, which path is more likely to occur on a timely basis: - Removal/upgrade of resolvers or middle boxes causing these issues - Deployment of new code on resolvers and clients with ways of addressing the RRSIG issue (I don't think there is any real reason or value for auth-only servers to do anything different, or at most only add the auth piece of any new logic.) If the latter (deployment of new code) is the path of least resistance (which would be unpleasant, obviously), the question would then be: how would a client signal to a server, that it wants RRSIG records for a specific signed RRSET/RRTYPE? The assumption would probably be a worst-case scenario: no EDNS, but possibly transparent path for AD/CD bits, and possibly support for new OPCODEs. (Testing real paths might be needed for the OPCODE support.) 
The methods I can think of are basically:
- Underscore added to QNAME, to indicate the second QTYPE (either
  _RRSIG.QNAME for QTYPE==thing that is signed by the RRSIG, or
  _QTYPE.QNAME with real QTYPE being RRSIG).
- New OPCODE for RRSIG, so instead of OPCODE==0 and QTYPE==FOO, have
  OPCODE==RRSIG and QTYPE==FOO.
- The returned reply would either be just the RRSIG of the right QTYPE, or
  the answer of QTYPE RRSET in the Answer section, and the RRSIG(s) in the
  Additional section.

Absent the above, it is probably fair to exclude RRSIG from things that
can get sensible answers, and 403[345] should be updated to clarify.
(IMHO, the extra logic might not be too bad, and would potentially be
useful for advancing the deployment of validating stubs.)

Brian
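The first (underscore-QNAME) method is easy to sketch as a pure name transform; these helper names and the label convention are hypothetical, not from any spec:

```python
def encode_rrsig_qname(qname, covered_qtype):
    """Hypothetical underscore encoding from the text: ask for the RRSIG
    covering `covered_qtype` at `qname` by prefixing an underscore label,
    since a plain QTYPE=RRSIG query cannot say whose signature is wanted."""
    return f"_{covered_qtype.lower()}.{qname}"

def decode_rrsig_qname(encoded):
    """Server-side inverse: recover (qname, covered_qtype) if the first
    label uses the underscore convention, else None."""
    first, _, rest = encoded.partition(".")
    if first.startswith("_") and len(first) > 1 and rest:
        return rest, first[1:].upper()
    return None
```

A client wanting the RRSIG over the A RRset at www.example.com. would send `_a.www.example.com.`; a server that decodes the prefix can select exactly the covering RRSIG, which a bare QTYPE=RRSIG query cannot express.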
[dns-operations] Broken A and J root responses
This is of interest to both resolver operators and Verisign. We have noticed broken responses to certain query types from some instances of A and J. This was raised originally by David Kinzel, BTW, on the DNS-OARC Mattermost channels. We have seen queries for NSEC for both "jp" and "sl" return results that could/would poison the root delegation NS set (and this was what David saw that started the investigation). See below for the query/response. Note the Authority section in particular. Brian Dickson GoDaddy dig +do +norec @a.root-servers.net nsec sl. +nsid ; <<>> DiG 9.16.7 <<>> +do +norec @a.root-servers.net nsec sl. +nsid ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27231 ;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 3, ADDITIONAL: 3 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags: do; udp: 4096 ; NSID: 6e 6e 6e 31 2d 73 66 6f 37 ("nnn1-sfo7") ;; QUESTION SECTION: ;sl. IN NSEC ;; ANSWER SECTION: sl. 86400 IN NSEC sling. NS RRSIG NSEC sl. 86400 IN RRSIG NSEC 8 1 86400 2021031117 2021022616 42351 . CQf3h+rHcoK2WSn7ItV8IQLb6yFFXSA+Lt86S58sm32u7QtTJsepap6r LcREA16YEmr5N9U7ytPyqNZmH92q24XGAtB0bikn9iZXTuIDG6BztbLr EqmDZ+lxutzmLDL2LOA9wcnk6TiKirxcId9j95Evy3gVNObAe94xvQIw 5LLtjeyQqRvWM+SAg7aXOyugedYIJtxUBVg9P7AHlLU+Z5HSfXo8EeJ9 NgyrkVnNnJNyJ7n02qNiyCiNm0lrkglWTbEAt5iquR6KiLlKcrB6ml3c ZSqfTBv108Ev+iuL3W80kWJEpkwomPRVlF+2R4yCZt38kA0Xc0VBp4FR hTlGYA== ;; AUTHORITY SECTION: . 172800 IN NS ns2.neoip.com. . 172800 IN NS ns1.neoip.com. . 518400 IN RRSIG NS 8 0 518400 2021031117 2021022616 42351 . WTZU7GHTyNZvGFvc+avXpUgu26QDWaywDOoS0Ac8FQnuVnwvIbYpdoew jMJFmZ5b7rWdzlJ6NgwURxLX7/0EOSDYk3sTdnjK9RtQbVtEBCueiSF4 3xkFNILgmiCYuoLQLHNpue/ORvEPMQUYif33KLoSgoX+qMLEqjrp14E0 qKmDCErjHkrV3uqRmvix5psxLSebhCz4WJeqPC3kIi6OcfGMQO5siI4L gVNnw9Hmal7W9UJGokDbhcsnb51Q43rGlrfp6pBosiWYfJDys9YWg4jU JUeShUFLH74SqavH+jQ0FsPoi5Vzbtfua3GUs0T67J2TpctlOjUBD3oz yX1g9g== ;; ADDITIONAL SECTION: ns2.neoip.com. 
172800 IN A 64.202.189.47
ns1.neoip.com. 172800 IN A 45.83.41.38

;; Query time: 21 msec
;; SERVER: 198.41.0.4#53(198.41.0.4)
;; WHEN: Fri Feb 26 11:12:15 PST 2021
;; MSG SIZE  rcvd: 719
Re: [dns-operations] Speaking of fixing things...
Hi, Viktor,

Would you mind checking the list for domains with broken signed
delegations to anything matching *.domaincontrol.com (GoDaddy's
nameservers), including categorization (e.g. lame NS, vs non-lame NS with
broken signature)?

My suspicion is there may be a bunch of lame delegations, and knowing
which TLDs (and if possible domains!) would be greatly appreciated.
Cleaning up lame delegations is neither easy nor fast, but we do want to
actually clean them up. (The root issue is there is currently no path for
the delegatee to get the lame delegation removed. None. Nada. :-( )

Thanks,
Brian

On Thu, Oct 29, 2020 at 10:59 PM Viktor Dukhovni wrote:

> I have a list of ~69k domain names with extant DS RRsets, where the
> DNSKEY RRset has been either unavailable or failing validation for 180
> days or more (92k domains if the bar is set to 90 days). These span 439
> TLDs! Of these domains, ~30k are simply lame and zone apex NS lookups
> fail even with CD=1. The remaining ~39k likely have DNSSEC-specific
> misconfiguration.
>
> The top 25 TLDs by count of long-term dead signed delegations are:
>
>   24742 com
>    9258 nl
>    5357 se
>    4553 cz
>    2897 net
>    2763 eu
>    2044 pl
>    1661 org
>    1070 no
>    1035 hu
>     992 fr
>     916 nu
>     731 uk
>     701 info
>     594 be
>     562 ch
>     557 xyz
>     552 de
>     421 es
>     349 sk
>     346 dk
>     321 app
>     282 io
>     250 biz
>     240 pt
>
> If any of the TLDs have policies that allow the deadwood to be delisted
> (still registered, but not delegated) I can provide the list of
> domains... It would be nice to see less breakage in the live zones.
>
> --
> Viktor.
Re: [dns-operations] [Ext] DNS Flag Day 2020 will become effective on 2020-10-01
On Fri, Sep 11, 2020 at 1:01 PM Vladimír Čunát wrote:

> On 9/11/20 9:14 PM, Randy Bush wrote:
> >> The main issue with having the discussion on github, is that it is a
> >> discussion on github, not on a major mailing list involving the
> >> operators and folks doing independent implementations.
> > for cabals which like a bubble, this is a feature, not a bug
>
> Are you telling me that Flag Day 2020 got too little publicity in here
> and similar circles? (its web, linked to GitHub, the plans, etc.) I
> rather thought we've pushed it everywhere often enough to make anyone
> sick of the topic, but perhaps my perception is biased. I'm really
> sorry if anyone feels excluded from the discussion. To be clear, we've
> had multiple "Flag Day 2020" threads just on this list.

TL;DR: Yes, or rather the content of that discussion was not necessarily raised adequately in other venues, IMNSHO.

The main participants on that github site appear not to have had enough breadth and depth of experience on networks, low-level transports, and all the roles in the DNS ecosystem, to collectively make supportable conclusions/decisions. E.g. "voting" based on participants' opinions is neither justifiable nor sensible, especially if there are only five voters. The majority of those votes appeared to be suffering from "group think" after reading the paper discussing the range of 12xx to 1400.

In particular, things really went off the rails when discussing client-side defaults lower than current defaults, i.e. Paul Vixie's suggestion to leave the offered client bufsize as is and only change the server-side configured max size. The discussion around that mostly devolved to vendors/implementers defending past engineering/implementation choices ("that's how we did it"), without adequately considering the actual real-world impact. The deployment to client side machines will be a slow roll, for sure.
But the difficulty, time, and pain of rolling that *back*, when it is discovered to be a major operational PITA and cost for authority operators, have been completely overlooked.

In short: I would be perfectly okay if the recommendation were ONLY for the authority servers (and the server side of resolvers) to lower their default configured UDP bufsizes, at which point having a range of recommended values (rather than a single value) would be more appropriate.

Server-side defaults can have their values changed (overridden) by config changes, but that ONLY has effect if the clients are NOT ALSO offering the SAME lowered values. That's the problem: EDNS0 UDP bufsize negotiation allows different values to be configured/offered, and uses the MINIMUM value. If both ends have their defaults lowered, and that causes a problem, it CANNOT be fixed unilaterally. Even considering only the recursive resolver population (estimated at ~3M), this is a huge issue, and IMHO a huge mistake.

The analysis of the relative impact (e.g. N x cost for TCP) ignores things like state exhaustion, where the state CANNOT be increased: since port 53 is mandated and server IP addresses are hardcoded everywhere, you cannot add IP addresses or ports to fix state exhaustion, which can be a localized issue on anycast networks.

Sorry for the long message, but it is really a big deal, and the timeline is unfortunate. I'd suggest pushing the date back by a month or two minimum, and re-opening discussion on these issues on the github site. Or, discuss them here with a wider set of participants.

Brian

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
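The minimum-wins negotiation Brian describes can be shown in a few lines. This is a toy model with illustrative values, not any implementation's actual code:

```python
# Minimal model of EDNS0 UDP buffer-size negotiation: the effective
# limit is the MINIMUM of what the client offers and what the server
# is configured to send. The 4096/1232 values below are illustrative.

def effective_udp_limit(client_offered: int, server_max: int) -> int:
    return min(client_offered, server_max)

# Server-side-only change: clients still offer 4096, so if 1232 turns
# out to be painful, the server operator can raise server_max alone.
assert effective_udp_limit(4096, 1232) == 1232

# Both ends lowered: even after the server operator raises its limit
# back to 4096, the client's offered 1232 still wins -- the problem
# cannot be fixed unilaterally from the authority side.
assert effective_udp_limit(1232, 4096) == 1232
```

This is why the side on which defaults are lowered matters: server-side configuration is reversible by the operators who feel the pain, while a lowered client-side default is pinned until the entire client population upgrades again.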
Re: [dns-operations] [Ext] DNS Flag Day 2020 will become effective on 2020-10-01
To quote the late Douglas Adams, from HHGttG: > > “There’s no point in acting surprised about it. All the planning charts > and demolition orders have been on display at your local planning > department in Alpha Centauri for 50 of your Earth years, so you’ve had > plenty of time to lodge any formal complaint and it’s far too late to start > making a fuss about it now. … What do you mean you’ve never been to Alpha > Centauri? Oh, for heaven’s sake, mankind, it’s only four light years away, > you know. I’m sorry, but if you can’t be bothered to take an interest in > local affairs, that’s your own lookout. Energize the demolition beams.” The main issue with having the discussion on github, is that it is a discussion on github, not on a major mailing list involving the operators and folks doing independent implementations. The other main issue is, that EDNS UDP size is negotiated. This means that it is NOT required that the default be the same on both the client and server. I would argue that which end should be lower, should depend largely on: 1. The number of potential places overrides are necessary 2. The comparative skill and expertise of the operators in those places 3. The log-log nature of distribution of volume of queries between the top-talkers (biggest recursives and biggest authorities) and the long tail 4. The position in the ecosystem of the various software elements, i.e. client-recursive vs recursive-root vs recursive-TLD vs recursive-leaf-auth There is asymmetry involved, when a lot of small-ish clients have new defaults that end up triggering excessive TCP traffic, which disproportionately impacts the authority server operators who then have no control over the situation. 
As such, I would greatly prefer that the recommendation be lifted to a higher number, supported by the data, which achieves the following goals simultaneously:

- Minimize the probability of fragmentation (to approximately 0.1% or 0.01% or even lower)
- Minimize the resulting degree of TCP traffic triggered by DNS responses that exceed the negotiated UDP size

To me, that means maximizing the UDP size within a reasonable range of observed data points with similar-enough behavior.

From a theoretical perspective, I would be surprised if 1452 wouldn't work, but the data suggests otherwise, from what I recall. (1452 = 1500 minus 8 bytes for MPLS or 802.1q, twice, plus L2 (ethernet frame) encapsulation over MPLS, plus IP-in-IP encapsulation, either/or, twice.)

If 1400 works pretty much as well as 1232, I really want to encourage re-evaluating the consensus on the Flag Day 2020 number.

Brian

P.S. Maybe we could call this "frag day" instead? Apologies if anyone finds that term offensive for any reason. But this is all about fragmentation, and "frag" is much less of a mouthful, and it rhymes with "flag".

On Fri, Sep 11, 2020 at 9:46 AM Vladimír Čunát wrote:

> On 9/11/20 4:44 PM, Paul Hoffman wrote:
> > If this is really just a vendor-driven flag day, please be clearer about
> > that on the web page.
>
> The GitHub repo and other places have been open for *everyone* to
> participate in the discussions. That's how I understand the "we",
> similarly to "we DNSOP". Yes, the final number was not unanimous, but
> such a thing rarely happens. And yes, I think it's true that
> open-source resolver vendors were the most active there.
>
> --Vladimir

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
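For reference, one common derivation of these numbers (not necessarily the one Brian had in mind) starts from the link MTU and subtracts the IPv6 and UDP headers; extra shim headers on the path (MPLS labels, 802.1q tags, tunnel encapsulation) each shave off a few more bytes, which is why values between roughly 1400 and 1452 come up:

```python
# Worked arithmetic for the candidate EDNS sizes: the Flag Day 2020
# value 1232 is the IPv6 minimum MTU (1280) minus the IPv6 and UDP
# headers; 1452 is the analogous figure for a full 1500-byte Ethernet
# MTU with no extra encapsulation on the path.
IPV6_HEADER = 40
UDP_HEADER = 8

def max_dns_payload(link_mtu: int) -> int:
    """Largest unfragmented DNS/UDP payload over IPv6 at this MTU."""
    return link_mtu - IPV6_HEADER - UDP_HEADER

assert max_dns_payload(1280) == 1232   # IPv6 minimum MTU -> Flag Day number
assert max_dns_payload(1500) == 1452   # plain Ethernet, no shims
```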
Re: [dns-operations] [Ext] Nameserver responses from different IP than destination of request
On Mon, Aug 31, 2020 at 5:09 PM Paul Hoffman wrote:

> On Aug 31, 2020, at 2:47 PM, Viktor Dukhovni wrote:
> >
> > Quite likely the domains that are completely broken (none of the
> > nameservers respond from the right IP) are simply parked, and nobody
> > cares whether they actually work or not.
> >
> > The only reason you're seeing queries for them may be that folks doing
> > DNS measurements query all the domains we can find, including the parked
> > ones that nobody actually cares to have working.
>
> These assumptions seem... assumptiony. I'd love to see some data from
> anyone who is collecting it on which NS names or IPs are exhibiting the
> behavior.

I don't disagree, but the data would really only be visible to anyone on-path or at either end of the resolver-to-authority transaction.

I think the only way to get meaningful data would be an active experiment, involving an authority server (or set of servers) for a domain set up just this way. That is the kind of thing that Geoff and George are good at, so if they want to do such an experiment and let us all know the results, I think that would be interesting. But I can't compel them to do that, and absent them choosing to do so, I think the general consensus is it's fine to let the broken stuff be broken.

(The interesting result would be on the resolver side: which resolvers, if any, accept broken answers, and if possible, inferring the resolver operator's software.)

Brian

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
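The mechanics of such a probe can be sketched with the stdlib alone. The query builder below is offline-testable; actually sending the packet and comparing the reply's source address (from `recvfrom()`) against the destination is the experiment itself, left out here. The zone name and addresses are placeholders, not real test infrastructure:

```python
# Sketch of the active experiment discussed above: query a nameserver
# and check whether the reply arrives from the same IP the query was
# sent to (a stock resolver should drop mismatched replies).
import random
import struct

def build_query(qname: str, qtype: int = 1) -> bytes:
    """Minimal DNS query packet: 12-byte header + question, no EDNS."""
    header = struct.pack("!HHHHHH",
                         random.randrange(65536),  # random query ID
                         0x0100,                   # flags: RD=1
                         1, 0, 0, 0)               # one question
    qname_wire = b"".join(bytes([len(label)]) + label.encode()
                          for label in qname.rstrip(".").split(".")) + b"\x00"
    return header + qname_wire + struct.pack("!HH", qtype, 1)  # class IN

def reply_source_matches(dest_ip: str, reply_src_ip: str) -> bool:
    """The property being measured: did the reply come from the queried IP?"""
    return dest_ip == reply_src_ip

pkt = build_query("example.com")
assert len(pkt) == 12 + 13 + 4        # header + encoded QNAME + QTYPE/QCLASS
assert not reply_source_matches("192.0.2.1", "192.0.2.2")
```

Sending `pkt` over a UDP socket to each NS address and tallying `reply_source_matches` per resolver population is what would answer Paul's question about which software accepts the broken replies.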
Re: [dns-operations] prefetching and thundering herds
TL;DR: I think the main issue is to make sure that any caching "stubs" (e.g. resolvers in forward-only mode) do NOT prefetch, but rather only query for expired entries when a natural query from the forwarder's client(s) (e.g. from an application on the host) occurs. That should, in principle, prevent a thundering herd from client to recursive.

Doing the opposite (prefetch by the forwarder) would definitely cause thundering-herd behavior, and likely significantly degraded performance from the client applications' perspective. (UDP, network queues, hardware queues, retries, and all that fun stuff being triggered with greater and greater synchronization over time.)

Brian

On Wed, Jul 15, 2020 at 4:50 AM Tony Finch wrote:

> I've been wondering about the effects of stub resolvers with caches as
> clients of recursive servers. To what extent do they cause a thundering
> herd effect where all the cache entries expire with the same deadline?
> The herd will arrive when the RRset expires so most of those clients will
> hit maximum latency and stress the server's query deduplication mechanism.
>
> (I don't think I have enough traffic to get a useful answer from my
> servers right now.)
>
> If thundering herds happen, do they thunder enough to help explain the
> lack of benefit from prefetching observed by PowerDNS?
>
> Or maybe the herd is too small to thunder? Instead there's a more
> gentle swell of queries after the TTL expires?
>
> https://lists.dns-oarc.net/pipermail/dns-operations/2019-April/018605.html
>
> If there is much of a herd, would it make sense to give some proportion of
> the clients a slightly reduced TTL so that they will trigger prefetch
> before the rest of them requery?
>
> Tony.
> --
> f.anthony.n.finch    http://dotat.at/
> Bailey: Southwest 4 or 5, increasing 6 or 7 later. Moderate or rough,
> occasionally very rough later in far northwest. Drizzle, fog patches.
> Moderate or poor, occasionally very poor.
___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations
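The distinction Brian draws can be modeled in a few lines: an on-demand forwarder cache refetches an expired name only when a client actually asks, while a prefetching one fires at TTL expiry regardless, which is what synchronizes the upstream queries from many forwarders sharing one deadline. A toy model with a fixed fake TTL and no real DNS:

```python
# Toy forwarder cache contrasting on-demand refresh with prefetch.
# Everything here is illustrative: the 30-second TTL and the tick
# mechanism stand in for a real cache's timers.

class ForwarderCache:
    def __init__(self, prefetch: bool):
        self.prefetch = prefetch
        self.store = {}            # name -> (answer, expiry time)
        self.upstream_queries = 0  # load placed on the recursive

    def _fetch(self, name, now):
        self.upstream_queries += 1
        self.store[name] = (f"addr-of-{name}", now + 30)  # fake 30s TTL
        return self.store[name][0]

    def tick(self, now):
        """Timer callback; only a prefetching cache acts here."""
        if self.prefetch:
            for name, (_, expiry) in list(self.store.items()):
                if now >= expiry:
                    self._fetch(name, now)

    def resolve(self, name, now):
        """Client-driven lookup: refetch only if expired or absent."""
        entry = self.store.get(name)
        if entry and now < entry[1]:
            return entry[0]
        return self._fetch(name, now)

on_demand = ForwarderCache(prefetch=False)
on_demand.resolve("example.net", now=0)
on_demand.tick(now=31)                  # expired, but no client asked
assert on_demand.upstream_queries == 1  # idle forwarders stay quiet

eager = ForwarderCache(prefetch=True)
eager.resolve("example.net", now=0)
eager.tick(now=31)                      # refetches at expiry regardless
assert eager.upstream_queries == 2      # every such forwarder fires at once
```

With thousands of prefetching forwarders holding the same RRset, every `tick` past the shared expiry produces one upstream query each, at the same moment; the on-demand variant spreads those queries over whenever clients next ask.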
Re: [dns-operations] any registries require DNSKEY not DS?
On Fri, Apr 17, 2020 at 12:57 PM Olafur Gudmundsson wrote:

> > On Jan 22, 2020, at 11:16 PM, Paul Vixie wrote:
> > On Thursday, 23 January 2020 02:51:28 UTC Warren Kumari wrote:
> > ...
> > If the parent makes the DS for me from my DNSKEY, well, then the DS
> > suddenly "feels" like it belongs more to the parent than the child,
> > but this is starting to get into the "I no longer know why I believe
> > what I believe" territory (and is internally inconsistent), so I'll
> > just stop thinking about this and go shopping instead :-)
> >
> > as you see, the DS RRset is authoritative in the parent, in spite of its
> > name being the delegation point, which is otherwise authoritative only
> > in the child. so, DS really is "owned by" the delegating zone, unlike,
> > say, NS.
> >
> > historians please note: we should have put the DS RRset at
> > $child._dnssec.$parent, so that there was no exception to the rule
> > whereby the delegation point belongs to the child. this was an unforced
> > error; we were just careless. so, example._dnssec.com rather than
> > example.com.
> >
> > --
> > Paul
>
> Paul,
> If we start talking about history and looking back with hindsight:
>
> IMHO the second biggest mistake in DNS design was to have the same type
> in both parent and child zone. If RFC1035 had specified a DEL record in
> the parent and NS in the child (or the other way around), it would have
> been obvious to specify a range of records that were parent-only (just
> like meta records), thus all resolvers from the get-go would have known
> that types in that range only reside at the parent.
> ......
> If we had the DEL record then that could also have provided the glue
> hints, with no need for additional processing.

Would the method have potentially been to have GLUEA and GLUE records rather than effectively overloading the A/ status (authoritative vs not)? And then all of the new types that live only in the parent could have been signed.
I'm guessing it's way too late to start doing that now without rev'ing all of DNS to v2.

Brian

> You may recall that in 1995, when you and I were trying to formalize for
> DNSSEC what the exact semantics of the NS record were, you and Paul
> Mockapetris came up with:
> “Parent is authoritative for the existence of the NS record; Child is
> authoritative for the contents.”
>
> Just in case you are wondering, the biggest mistake was the QR bit:
> recursion should have been on a different port than authoritative service.
>
> But this is all hindsight based on 30 years of coding and operational
> difficulties.
>
> Regards,
> Ólafur

___
dns-operations mailing list
dns-operations@lists.dns-oarc.net
https://lists.dns-oarc.net/mailman/listinfo/dns-operations