Joe, Int-area,

I had started writing a review of draft-ietf-intarea-ipv4-id-update-02 some time ago, and picked it up again late last week. Today, I noticed rev -03 was posted, but with this review almost finished I figured it would not hurt to complete it and hit "Send".
I am hoping this is useful, and that potential issues that are already addressed or are no longer applicable in -03 do not burn many extra cycles.

Comments are grouped into more substantive and more editorial.

More Substantive:
~~~~~~~~~~~~~~~~

Abstract

   This document updates the specification of the IPv4 ID field in RFC 791, RFC 1122, and RFC 2003 to more closely reflect current practice and to more closely match IPv6 so that the field is defined only when a datagram is actually fragmented.

As written, because in-flight IPv4 fragmentation is supported, does this last statement mean that the IP-ID of an unfragmented datagram at the source is undefined, and takes on semantics when fragmented -- potentially in-the-network?

1. Introduction

   In IPv4, the Identification (ID) field is a 16-bit value that is unique for every datagram for a given source address, destination address, and protocol, such that it does not repeat within the Maximum Segment Lifetime (MSL) [RFC791][RFC1122].

Although this document is quite strict in its definition of terms and specifications, there is the assumption that MSL applies to all datagrams (even without segments) although it is defined for TCP. For some protocols the max datagram lifetime can be much smaller... As this is attempting to modify the IP-ID at the IP layer, is the assumption of this MSL for everything valid?

4. Uses of the IPv4 ID Field

Should this be renamed to be qualified "Potential", "Suggested", or "Theoretical" uses?

   The IPv4 ID field was originally intended for fragmentation and reassembly [RFC791]. Within a given source address, destination address, and protocol, fragments of an original datagram are matched based on their IPv4 ID. This requires that IDs are unique within the address/protocol triple when fragmentation is possible (e.g., DF=0) or when it has already occurred (e.g., frag_offset>0 or MF=1).

Similarly, this is during the max lifetime. Has the likely max lifetime decreased? Does this update the TTL definition?
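
For reference, the condition in the quoted Section 4 text (uniqueness matters when fragmentation is possible or has already occurred) can be written down directly from the header bits. This is only a sketch in Python; the function and constant names are illustrative and do not come from the draft, and the input is assumed to be the 16-bit flags/fragment-offset word of the IPv4 header.

```python
# Illustrative sketch; names are assumptions, not taken from the draft.
# Layout of the 16-bit flags/fragment-offset word in the IPv4 header (RFC 791):
# bit 15 reserved, bit 14 DF, bit 13 MF, low 13 bits fragment offset.
DF_BIT = 0x4000
MF_BIT = 0x2000
FRAG_OFFSET_MASK = 0x1FFF


def id_must_be_unique(flags_and_offset: int) -> bool:
    """Return True when the ID is needed for reassembly.

    That is the case when fragmentation is still possible (DF=0) or has
    already occurred (MF=1 or fragment offset > 0), per the quoted text.
    """
    df = bool(flags_and_offset & DF_BIT)
    mf = bool(flags_and_offset & MF_BIT)
    frag_offset = flags_and_offset & FRAG_OFFSET_MASK
    return (not df) or mf or frag_offset > 0
```

The only case where this returns False is DF=1, MF=0, offset=0, which is what the draft later calls an atomic datagram.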

   The IPv4 ID field can also be used to validate payloads of ICMP responses as matching the originally transmitted datagram at a host [RFC4963]. In this case, the ICMP payload - an IP datagram prefix - is matched against a cache of recently transmitted IP headers to check that the received ICMP reflects a transmitted datagram.

"ICMP responses" or errors? In any case, I seem to recall reading arguments against this on the basis that ICMP generation is not time bound, it can arrive whenever, etc. Also RFC4963 does not discuss this particular "use".

   At a tunnel ingress, the IPv4 ID enables returning ICMP messages to be matched to a cache of recently transmitted datagrams, to support ICMP relaying, with similar challenges [RFC2003].

   Uses of the IPv4 ID field beyond fragmentation and reassembly require that the IPv4 ID be unique across all datagrams, not only when fragmentation is enabled. This document deprecates all such non-fragmentation uses.

As defined today, that is the same as for the reassembly use. The IPv4 ID is defined to be unique for the max lifetime of a datagram (RFC 791 does not say "MSL") per src/dst/proto for all datagrams, not only for when fragmentation is "enabled".

5. Background on IPv4 ID Reassembly Issues

   There is thus no enforcement mechanism to ensure that datagrams older than 120 seconds are discarded.

Doesn't this statement moot most of the document?

   throughput over LANs (e.g., disk-to-disk file transfer rates), and numerous throughput demonstrations have been performed with COTS systems over wide-area paths at these speeds for over a decade. This strongly suggests that IPv4 ID uniqueness has been moot for a long time.

Do these applications have a maximum packet lifetime of 120 seconds? Is this actually demonstrating that the max datagram lifetime is smaller for these?

6.1. IPv4 ID Used Only for Fragmentation

   o >> Originating sources MAY set the IPv4 ID field of atomic datagrams to any value.

What is actually gained by doing this?
Things are lost though, like the ability to use the IP-ID in debugging and troubleshooting for packet identification by protocol analyzers and humans.

   o >> All devices that examine IPv4 headers MUST ignore the IPv4 ID field of atomic datagrams.

Similarly, this is quite bold. Does Wireshark need to skip the IPv4 ID here?

   Deprecating the use of the IPv4 ID field for non-reassembly uses should have little - if any - impact. IPv4 IDs are already frequently repeated, e.g., over even moderately fast connections. Duplicate suppression was only suggested [RFC1122], and no impacts of IPv4 ID reuse have been noted. Routers are not required to issue ICMPs on any particular timescale, and so IPv4 ID repetition should not have been used for validation, and again repetition occurs and probably could have been noticed [RFC1812]. ICMP relaying at tunnel ingresses is specified to use soft state rather than a datagram cache, and should have been noted if the latter for similar reasons [RFC2003].

This makes a lot of sense.

6.2. Encourage Safe IPv4 ID Use

I think that these reasons and this section, as per below, are not compelling; we should not be encouraging this...

   o >> The IPv4 ID of non-atomic datagrams MUST NOT be reused when sending a copy of an earlier non-atomic datagram.

The IP layer can duplicate a packet and...

   This overlap is noted as the result of reusing IPv4 IDs when retransmitting datagrams, which this document deprecates. Overlapping fragments are themselves a hazard [RFC4963]. As a result:

   o >> Overlapping datagrams MUST be silently ignored during reassembly.

There are other legit reasons for overlapping datagrams, like duplicates taking different paths...

   o >> The IPv4 ID field of non-atomic datagrams, or protected atomic datagrams MUST NOT change in transit; the IPv4 ID field of unprotected atomic datagrams MAY be changed in transit.

Why would you allow the change for atomic datagrams? Again, what's gained?
Is there a case where this happens today (not in a NAT)?

7. Impact on Datagram Use

   o >> Sources of non-atomic IPv4 datagrams MUST rate-limit their output to comply with the ID uniqueness requirements.

I do not think this can be mandated, when a source cannot know the maximum datagram lifetime. Certainly not in such a peanut-butter way.

   Such sources include, in particular, DNS over UDP [RFC2671].

What is the "MSL" for DNS over UDP?

   o >> Higher layer protocols SHOULD verify the integrity of IPv4 datagrams, e.g., using a checksum or hash that can detect reassembly errors (the UDP checksum is weak in this regard, but better than nothing), as in SEAL [RFC5320].

This seems to be an out-of-scope requirement as per the Abstract.

   This document changes RFC 791 as follows:

   o IPv4 ID uniqueness applies to only non-atomic datagrams.

This means that a source can set IP-ID to zero for all atomic datagrams. Is this a better state than the current one? Protocols doing PMTUD [RFC 1191] would source atomic datagrams, and sometimes it is useful to humans reading a packet capture to follow IP-IDs (in packets and in headers embedded in ICMPs).

   o Non-atomic IPv4 datagrams retransmitted by higher level protocols are no longer permitted to reuse the ID value.

So hosts that today do not care about the IP-ID on retransmissions now need to start checking so that they do not reuse it?

8.2. Updates to RFC 1122

   o The IPv4 ID field is no longer repeatable for higher level protocol retransmission.

"repeatable" as in it needs to check now that it does not repeat? Instead of relaxing, this potentially adds a requirement.

   o IPv4 datagram fragments no longer are permitted to overlap.

This is like saying "Eclipses are no longer permitted to occur."

9. Impact on NATs and Tunnel Ingresses

   Tunnel ingresses act as sources for the outermost header, but tunnels act as routers for the inner headers (i.e., the datagram as arriving at the tunnel ingress).
   Ingresses can fragment as originating sources of the outer header, because they control the uniqueness of that IPv4 ID field. They need to avoid fragmenting the datagram at the inner header, for the same reasons as any intermediate device, as noted elsewhere in this document.

Is this trying to implicitly abolish in-the-network IPv4 fragmentation? For the inner header, a Tunnel ingress acts like a router, and the Tunnel as a link. Therefore, why would it need to "avoid fragmenting the datagram at the inner header"? More generically, this document implies that host-only fragmentation is the only way and flirts with abolishing intermediate frag...

10. Impact on Header Compression

I do not support this section. Seems to optimize the corner case at a very high expense.

11. Security Considerations

   This document attempts to address the security considerations associated with fragmentation in IPv4 [RFC4459].

Which of the security considerations there is addressed by the changes proposed in this document, specifically?

More Editorial:
~~~~~~~~~~~~~~

Abstract

   The IPv4 Identification (ID) field enables fragmentation and reassembly, and as currently specified is required to be unique within the maximum lifetime on all datagrams. If enforced, this uniqueness requirement would limit all connections to 6.4 Mbps.

Unique... for a given src/dst/proto tuple. All connections... given the above.

   Because this is obviously not the case, it is clear that existing systems violate the current specification.

Is this a self-referencing statement? It is true because it is obviously not false?

1. Introduction

   The uniqueness of the IPv4 ID is a known problem for high speed devices;

Do you mean "The lack of uniqueness"? Also, is it a "known problem" or is it a "document statement"? If the former, what is the actual problem happening in the wild?

   if strictly enforced, it would limit the speed of a single protocol between two endpoints to 6.4 Mbps for typical MTUs of 1500 bytes [RFC4963].
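
The arithmetic behind the quoted limit can be sketched as follows. This is a rough Python sketch, assuming the bound is simply 2^16 distinct IDs times the datagram size per lifetime for one src/dst/proto triple; the function name and parameters are illustrative, not from the draft or RFC 4963.

```python
# Illustrative sketch; the name and parameters are assumptions for this example.
def id_limited_rate_bps(datagram_bytes: int,
                        lifetime_s: float = 120.0,
                        id_bits: int = 16) -> float:
    """Upper bound, in bits per second, on what one (source, destination,
    protocol) triple can send if no ID value may repeat within lifetime_s."""
    distinct_ids = 2 ** id_bits
    return distinct_ids * datagram_bytes * 8 / lifetime_s
```

With 1500-byte datagrams and a 120-second lifetime this gives about 6.55 Mbps, close to but not exactly the 6.4 Mbps in the draft, so the draft presumably counts slightly differently (e.g., payload rather than full datagram size; that is a guess). With 65535-byte datagrams it gives about 286 Mbps. The bound scales inversely with the assumed lifetime, so a hypothetical 2-second lifetime at 1500 bytes would allow roughly 393 Mbps.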

Is an "endpoint" defined as an IPv4 address? An endpoint (as a node, host, etc) is not really limited to this, if using multiple addresses, limiting the max datagram lifetime for a given protocol, etc. This is a long MSL BTW.

3. The IPv4 ID Field

   IP supports datagram fragmentation, where large datagrams are split into smaller components to traverse links with limited maximum transmission units (MTUs).

Is this tangling path vs. link MTU?

   o In IPv6, fragments are indicated in an extension header that includes an ID, Fragment Offset, and MF flag similar to their counterparts in IPv4 [RFC2460]

In IPv6, the More Fragments is the "M flag", not the "MF flag".

5. Background on IPv4 ID Reassembly Issues

   With the maximum IPv4 datagram size of 64KB, a 16-bit ID field that does not repeat within 120 seconds means that the aggregate of all TCP connections of a given protocol between two endpoints is limited to roughly 286 Mbps; at a more typical MTU of 1500 bytes, this speed drops to 6.4 Mbps [RFC4963]. This limit currently applies for all

Same as before, "endpoint" here used as "src/dst IPv4 address pair".

6. Updates to the IPv4 ID Specification

   o Atomic datagrams: datagrams not yet having been fragmented (MF=0 and fragment offset=0) and for which further fragmentation has been inhibited (DF=1), i.e., as a C-code expression:

Doesn't this need a license? // formality

6.3. IPv4 ID Requirements That Persist

   This document does not relax the IPv4 ID field uniqueness requirements of [RFC791] for non-atomic datagrams, i.e.:

   o >> Sources emitting non-atomic datagrams MUST NOT repeat IPv4 ID values within one MSL for a given source address/destination address/protocol triple.

But RFC 791 does not talk about "MSL".

   In specific, DF=1 prevents fragmenting datagrams that are integral. DF=1 also prevents further fragmenting received fragments.
   Fragmentation, either of an unfragmented datagram or of fragments, is current permitted only where DF=0 in the original emitted datagram, and this document does not change that requirement.

I want to make sure I follow this logic. What does "in the original emitted datagram" mean?

9. Impact on NATs and Tunnel Ingresses

   Network address translators (NATs) and address/port translators (NAPTs) rewrite IP fields, and tunnel ingresses (using IPv4 encapsulation) copy and modify some IPv4 fields, so all are considered sources, as do any devices that rewrite any portion of the source address, destination address, protocol, and ID tuple for non-atomic datagrams [RFC3022]. As a result, they are subject to all the requirements of any source, as has been noted.

Why only non-atomic? Also, outer, and not inner, correct?

   o >> NATs MUST ensure that the IPv4 ID field of datagrams whose address or protocol are translated comply with requirements as if the datagram were sourced by the NAT.

In which direction?

Thanks,

-- Carlos.