> On 13 Jul 2020, at 23:35, Tony Finch <[email protected]> wrote:
> 
> I've had a read through and here are a few, er, I mean several things that
> caught my eye:
> 

Hi Tony, 

Many thanks for the detailed review!

> 
> In the intro, I think it's too strong to say that RFC 5155 was "to
> prevent" zone enumeration - its abstract says it "provides measures
> against" which is a more accurate guide to NSEC3's effectiveness.
> Also the
> same paragraph could probably be more clear that NSEC5 is not a practical
> thing (yet? or likely ever?). I.e., neither of them are really useful
> privacy mechanisms.

Yes - agree this could be more specific on both topics. Will re-word as 
suggested.

> 
> 4.2 IXFR - RFC 1995 doesn't use RFC 1123-style requirements keywords (and
> obviously it predates RFC 2119) so I don't think you can say the
> lower-case "should" is non-normative. Spelling "forth" -> "fourth" :-)

Both fixed.

> 
> The last paragraph in this section should have a cross-reference to the
> section that describes the new IXFR requirements in detail. If these
> requirements are supposed to apply to pure TCP as well as IXoT then it's
> probably worth promoting them to a top-level section to make it more
> obvious that they exist, independent of TLS. Apart from this paragraph,
> section 4 looks more like a non-normative summary of existing
> specifications, which is useful background information, but I think it's
> helpful to clearly separate normative and informative sections.

Agree. These requirements are meant to apply specifically to IXFR-over-TCP so I 
have created a separate top level section with the title ‘Update to RFC1995 for 
IXFR-over-TCP’ and moved the normative statements there.

> 
> 4.3 Is it worth discussing information leakage about which zones are
> present on a secondary? i.e. is that part of the threat model?

We didn’t include that in this threat model because that can in principle be
discovered by active measurements of the DNS, but it might be worth a sentence
to explain that.

> 
> 5.3 I'm not sure I understand what this section is getting at. Is it
> saying that a client can have either an XoTCP or an XoTLS connection, but
> not both? Because it should try to limit itself to one connection of any
> kind for zone transfers?

Not at all - perhaps it needs re-wording. RFC7766 recommended client/server
connections be:

* one TCP connection for regular queries
* one TCP connection for zone transfers
* one connection per other transport on top of TCP (the implication being this 
is used for everything)

As an author on RFC7766 I was surprised when I went back to read it and could 
not remember the specific rationale for it!

The intention in this section is to update this guidance to say that all
connection-based transports should use separate connections for regular queries
and zone transfers, just like TCP. So in principle the client/server
interaction _could_ look something like:

* one TCP connection for regular queries
* one TCP connection for zone transfers
* one DoT connection for regular queries
* one XoT connection for zone transfers
* one DoH connection for regular queries
* one XFR-over-DoH connection for zone transfers

Listing the potential connections out like this might make it more obvious? 
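
To make the model concrete, here is a rough sketch (Python, purely
illustrative) of a client that keeps at most one connection per transport and
per purpose to a given server. The class and label names are made up for this
example and do not come from the draft or from any implementation.

```python
from itertools import product

# Labels taken from the list above; they are not identifiers from any spec.
TRANSPORTS = ("TCP", "TLS", "HTTPS")
PURPOSES = ("regular queries", "zone transfers")


class ConnectionTable:
    """Tracks at most one connection per (transport, purpose) pair to one server."""

    def __init__(self):
        self._conns = {}

    def get_or_open(self, transport, purpose):
        key = (transport, purpose)
        if key not in self._conns:
            # A real client would open a socket here; a string stands in for it.
            self._conns[key] = f"conn#{len(self._conns) + 1} ({transport}, {purpose})"
        return self._conns[key]

    def count(self):
        return len(self._conns)


table = ConnectionTable()
for transport, purpose in product(TRANSPORTS, PURPOSES):
    table.get_or_open(transport, purpose)

# Asking again for an existing pair re-uses that connection.
again = table.get_or_open("TLS", "zone transfers")
```

With all three transports in use that gives six connections in total, and a
second XoT request lands on the existing XoT connection.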

> 
> 5.4 What is the base DNS RCODE for non-XoT traffic on an XoT connection?
> (extended errors do not have a fixed association with RCODEs)
> What about non-EDNS queries?

Ah yes - we should have specified REFUSED here as the base RCODE - fixed.
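
As a rough sketch of what that means for a server (Python, illustrative only):
the base RCODE is REFUSED whether or not the query used EDNS, and an extended
error can only be added when it did. The set of query types treated as XoT
traffic below (SOA, IXFR, AXFR) is an assumption for the example, and the
function name is hypothetical.

```python
# Standard DNS code points: RCODE REFUSED is 5; QTYPE SOA is 6, IXFR 251, AXFR 252.
RCODE_REFUSED = 5
QTYPE_A, QTYPE_SOA, QTYPE_IXFR, QTYPE_AXFR = 1, 6, 251, 252

# Assumption for illustration: which query types count as XoT traffic.
XOT_QTYPES = frozenset({QTYPE_SOA, QTYPE_IXFR, QTYPE_AXFR})


def refusal_for(qtype, query_has_edns):
    """Return None for XoT traffic, else a description of the error response."""
    if qtype in XOT_QTYPES:
        return None  # handled by the normal transfer code path
    return {
        "rcode": RCODE_REFUSED,
        # An extended error is carried in the OPT RR, so it is only attached
        # when the query itself used EDNS.
        "attach_extended_error": query_has_edns,
    }
```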

> 
> 5.6.2 AXoT
> 
> In the keepalive discussion, is the intention that a server can use a
> timeout of 0 to abort a connection in the middle of a transfer, or is it
> supposed to indicate that there can be no more transfers on the
> connection, but existing transfers in progress are allowed to finish?

RFC7828 says "A DNS client that receives a response that includes the
   edns-tcp-keepalive option with a TIMEOUT value of 0 SHOULD send no more
   queries on that connection and initiate closing the connection as
   soon as it has received all outstanding responses."

The intention of a 0 keepalive timeout is to stop further use of the existing
connection. If the server needs to terminate a particular AXFR immediately then
it still needs to close the connection at its end.
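
A minimal sketch of the client side of that behaviour (Python, illustrative
only; the class and method names are hypothetical): a timeout of 0 stops new
queries, but transfers already in progress are allowed to finish before the
client closes the connection.

```python
class XfrConnection:
    """Client-side view of one zone transfer connection."""

    def __init__(self):
        self.outstanding = set()  # message IDs of transfers still in progress
        self.accepting = True     # may new queries be sent on this connection?
        self.closed = False

    def send_query(self, msg_id):
        if self.closed or not self.accepting:
            raise RuntimeError("connection is draining; use a new connection")
        self.outstanding.add(msg_id)

    def on_response(self, msg_id, keepalive_timeout=None, last_message=False):
        # keepalive_timeout is the TIMEOUT field (units of 100 ms) if the
        # option was present in this response, or None if it was absent.
        if keepalive_timeout == 0:
            self.accepting = False  # send no more queries on this connection
        if last_message:
            self.outstanding.discard(msg_id)
        if not self.accepting and not self.outstanding:
            self.closed = True  # all outstanding responses received


conn = XfrConnection()
conn.send_query(1)
conn.send_query(2)
conn.on_response(1, keepalive_timeout=0)  # server asks the client to wind down
still_open_mid_transfer = not conn.closed
conn.on_response(1, last_message=True)
conn.on_response(2, last_message=True)
```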

The mention of abort in this text is misleading I think. I suggest updating:

OLD:
Note that this requirement, combined
with the use of EDNS0 Keepalive, enables AXoT servers to signal the
desire to close a connection due to low resources by sending an EDNS0
Keepalive option with a timeout of 0 on any AXoT response (in the
absence of another way to signal the abort of a AXoT transfer).

NEW:
Note that this requirement, combined
with the use of EDNS0 Keepalive, enables AXoT servers to signal the
desire to close a connection (when existing transactions have completed)
due to low resources by sending an EDNS0 Keepalive option with a 
timeout of 0 on any AXoT response. Aborting an AXFR during the transfer 
still requires the server to close the connection.

We did toy with the idea of defining a new EDNS0 option/extended error code for
a server to signal an abort of an individual AXoT without closing the
connection, but weren’t convinced there was a use case. Perhaps we should
revisit this (as per your next point)…?

> 
> Is there a reason for allowing concurrent AXFRs of the same zone?
> Actually, thinking about this more generally, I can't see a way in RFC
> 5936 for the server to impose backpressure to limit the number of
> concurrent AXFRs. And there isn't an extended error code for concurrency
> control or backpressure. If the server had a suitable response, that would
> allow it to control xfer resources in general, as well as to choose
> whether or not it wants to allow multiple AXFRs for the same zone at the
> same time.

I don’t believe RFC5936 says anything explicitly about concurrent transfer
behaviour, and while there may not be a use case for it, do you think we should
actually prohibit it?

Of course a server can error any AXFR if it chooses [RFC5936]:

"To indicate an error in an AXFR response, the AXFR server sends a
   single DNS message when the error condition is detected, with the
   response code set to the appropriate value for the condition
   encountered.  Such a message terminates the AXFR session;…” 

so it _could_ already answer SERVFAIL if it didn’t have the resources, or
REFUSED if a transfer is already underway and it doesn’t want to do another
one? I’m not actually sure what existing implementations do in this case (will
double check).

I suppose the advantage of adding an extended error code would be so that well 
behaved clients didn’t continue to request transfers that were going to be 
refused. 
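
To illustrate what a server could do today with plain RCODEs, here is a rough
sketch (Python). It is not a description of any existing implementation; the
class name, the limit and the choice of which RCODE goes with which condition
simply follow the guess above.

```python
from collections import Counter

RCODE_SERVFAIL = 2
RCODE_REFUSED = 5


class AxfrAdmission:
    """Decides whether a new AXFR may start or gets a single error message."""

    def __init__(self, max_concurrent, allow_same_zone=False):
        self.max_concurrent = max_concurrent
        self.allow_same_zone = allow_same_zone
        self.active = Counter()  # zone name -> transfers in progress

    def admit(self, zone):
        """Return None to start the transfer, or the RCODE to answer with."""
        if self.active[zone] > 0 and not self.allow_same_zone:
            return RCODE_REFUSED   # a transfer of this zone is already underway
        if sum(self.active.values()) >= self.max_concurrent:
            return RCODE_SERVFAIL  # no resources for another transfer
        self.active[zone] += 1
        return None

    def finish(self, zone):
        if self.active[zone] > 0:
            self.active[zone] -= 1
```

A client seeing only REFUSED or SERVFAIL cannot tell these conditions apart
from other failures, which is where a dedicated extended error code would help.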

> 
> Still 5.6.2
> 
> The connection re-use requirements seem to be restating 5.3 in more
> detail. Would it be more clear to put these related requierments in the
> same section?

Well 5.3 is a general update to RFC7766 about transport usage but 5.6.2 is 
specifically updating RFC5936… 

> 
> Re pipelining, I can't see in RFC 5936 whether concurrent AXFRs are just
> concurrent outstanding queries, with all the response messages for one
> zone sent back-to-back, or whether response messages for different
> concurrent AXFRs can be interleaved.

No, you are right - that behaviour isn’t explicitly specified there but the 
discussion around using message IDs to match responses at the end of section 
4.1.1. suggests/implies intermingling should work. Our draft doesn’t update 
RFC5936 at all (at the moment)… I hadn’t thought it necessary but perhaps we 
should actually make the normative statements around the updates to RFC1995 
apply to RFC5936 as well for consistency?
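
For what it's worth, the client-side handling of intermingled responses is
simple if message IDs are used for matching. A minimal sketch (Python,
illustrative; messages are reduced to an ID plus a list of record labels, which
is an assumption made to keep the example short):

```python
from collections import defaultdict


def demultiplex(messages):
    """Group response messages arriving on one connection by message ID."""
    transfers = defaultdict(list)
    for msg_id, records in messages:
        transfers[msg_id].extend(records)
    return dict(transfers)


# Two concurrent AXFRs (IDs 1 and 2) whose response messages are interleaved.
stream = [
    (1, ["soa-a", "ns-a"]),
    (2, ["soa-b"]),
    (1, ["soa-a"]),
    (2, ["a-b", "soa-b"]),
]
zones = demultiplex(stream)
```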

> 
> 5.6.3 padding
> 
> Why would empty response messages be needed? Isn't it enough to pad the
> regular response messages that contain RRs? (Or maybe reduce the number of
> RRs per message and increase the padding if more obfuscation is needed?)
> Servers need to keep track of zone sizes in order to mitigate
> CVE-2016-6170 (DoS attack by sending an excessively huge AXFR response) so
> I would expect servers to be able to use that accounting to decide how to
> spread padding between AXFR response messages, without the need for extra
> padding-only messages.

Adding that requirement to this document was a question of flexibility and 
future proofing to allow padding for AXFR to happen in different ways and with 
simple algorithms. e.g. the maximum size a (tiny) zone could be padded to would 
theoretically be limited (to something very large admittedly) if there had to 
be a minimum of 1 RR per packet. I can add some text to clarify this.
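
As a back-of-the-envelope illustration of that limit (Python; the function name
is made up, and the calculation assumes one RR per message with every message
padded to the 65535 byte maximum that the two-octet length prefix allows over
TCP or TLS):

```python
MAX_DNS_MESSAGE_OVER_TCP = 65535  # largest value of the two-octet length prefix


def max_padded_axfr_size(zone_rr_count):
    """Upper bound in bytes if every message must carry at least one RR."""
    # The SOA appears at both the start and the end of an AXFR response.
    records_on_wire = zone_rr_count + 1
    return records_on_wire * MAX_DNS_MESSAGE_OVER_TCP


tiny_zone_limit = max_padded_axfr_size(3)  # e.g. a zone with one SOA and two NS
```

So a three-record zone could be padded to at most 262140 bytes under that
constraint, whereas allowing padding-only messages removes the bound.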

> 
> 5.7 IXoT
> 
> Looking back and comparing with section 4.2, it looks like the concurrency
> requirements in section 5.7 only apply to TLS. Are they supposed to apply
> to TCP as well?

The normative statements in section 4.2 (in particular, follow RFC7766) do
require IXFR to support pipelining of queries and out of order processing and
re-use of one TCP connection for zone transfers. Section 5.7.1 should have a
sentence referencing back to that update, which it doesn’t at the moment -
I’ll add it.

The first paragraph in 5.7.1 should probably be moved to the new, earlier 
section on the normative update to RFC1995…. and paragraph 2 in section 5.7.2 
should probably be removed or apply to IXFR-over-TCP as well for consistency, I 
suppose. Keeping it seems preferable in order to remove any doubt about 
behaviour. 

Getting updates on top of updates for each type of transfer (or both) clear and 
consistent is a bit tricky here :-)

> 
> I think it would help to have some more explicit discussion of how IXoT
> and AXoT share a connection, wrt concurrency, interleaving of response
> messages (or not), and so forth. Perhaps as a subsection beween 5.5 and
> 5.6? Or maybe as an expanded 5.3? 

Are you thinking of some text clarifying that servers can send AXoT responses 
for different zones intermingled with each other and with IXoT responses and 
clients have to handle them? I guess I thought that was implicit in the RFC7766 
model but we could add some clarifying text. Again though, that would (I think) 
apply equally for AXFR and IXFR sharing a connection so perhaps it needs to 
appear earlier when they are discussed…. Do you have any error/problem cases in 
mind, or just clarifying what needs to be supported?


> Also covering other things that are
> common to IXot and AXoT like keepalive timeouts, concurrency backpressure,
> presence or absence of EDNS, padding, and anything else I've missed.

Well, at the moment the detailed discussion of keepalive and EDNS0 happens in
the AXoT section and the IXoT section just references back to that. I tried
moving it earlier, where it seemed out of context, and between the two, where
it seemed odd, so I landed on this structure, but I realise there is a lot of
overlap. I expect the structure to evolve a bit again in the next version so
thanks for all the feedback. Padding I think is better kept separate?

> 
> 7 authentication
> 
> It seems weird to mix up channel auth and data auth, since they are quite
> different things. As I understand it, ZONEMD isn't really authentication,
> it's just an integrity check (unless it is used in a signed zone). And if
> you are talking about data authentication it seems odd to leave out
> RRSIGs.
> 
> TSIG doesn't provide data authentication. It provides mutual
> authentication of the endpoints, and data integrity, but the server can
> lie to the client about the zone contents. (The server is not necessarily
> the ultimate authority for the zone.)
> 
> It would be useful to have terminology to distinguish between TLS where
> the client software tries on its own initiative, with fallback to TCP
> (which is what I think of when I read "opportunistic"); as opposed to TLS
> configured by the admin without fallback to TCP and without any client or
> server certificate auth. I'll call the latter "unauth".
> 
> I don't think strict TLS + TSIG adds any benefits beyond unauth TLS +
> TSIG, because TSIG already provides mutual auth. Well, there's some risk
> that the client may send requests to the wrong server, which goes back to
> my section 4.3 question about whether it is part of the threat model to
> worry about exposing which zones a client holds.
> 
> Mutual TLS is roughly comparable to unauth TLS + TSIG, but it has the
> advantage that it's a bit easier to set up in a way that prevents clients
> from being able to impersonate the server. If you want to do this with
> TSIG then every client needs its own key, and the server config has to be
> updated whenever a client is provisioned or decommissioned. With mutual
> TLS the server only needs a relatively static CA cert that can
> authenticate any client cert.
> 
> I think there should be something in the spec about how certificate
> subject names relate to how (in strict and mutual TLS) the client
> authenticates the server, and how (in mutual TLS) the server decides that
> the client's requests are authorized. I would like to be able to give my
> client a server name (and optional address) and have it authenticate the
> server using the system CA cert store and server certificate
> subjectAltNames. I would like to be able to give my server an ACL
> containing my private CA cert and a client cert subject name pattern.

For convenience, I’ll pull this topic out into a different thread if that is OK 
with you…?

> 
> 8 policies
> 
> I think the definition of xfer group can be slightly improved, like:
> 
>  We call the entire group of servers involved in XFR for a particular
>  set of zones (all the primaries and all the secondaries) the 'transfer
>  group' for those zones.
> 
> (My auth servers host multiple sets of zones belonging to several
> different institutions, with different and partially overlapping transfer
> groups, with different security configurations…)

Yup - that works.

> 
> I think "mTLS" should be written "Mutual" for consistency?

Sure. 

> 
> Finally, at last ...
> 
> The figures and tables were missing from the plain text version that I
> looked at so I didn't review them. I could guess what the diagrams showed
> but I got the impression that the table in section 7 was a bit more
> substantive.

At the moment they are just links to SVG images in the GitHub repo so in the 
plain text version you do need to copy and paste the URIs from section 16.3. 
(in the HTML you can click to have them open in a new tab). Attempting proper 
SVG integration (or failing that, reverting to ASCII art) is on the TODO list! 

Thanks.

Sara. 


_______________________________________________
dns-privacy mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dns-privacy
