On 10/17/24 18:42, John Levine wrote:
> It appears that Daniel K. <[email protected]> said:
>> RFC 7489, in the DMARC XML Schema of Appendix C, only allowed the full
>> IPv6 address textual representation without zero bit compression (::)
>>
>> ([A-Fa-f0-9]{1,4}:){7}[A-Fa-f0-9]{1,4}
>
> I agree that is wrong.
>
>> I think it is prudent to specify what values we want to see in those
>> reports, and that is, in my opinion, values matching the canonical
>> textual representation format specified by RFC 5952.
>
> Having done this kind of stuff before, I can report that trying to make
> the syntax as super strict as possible is rarely a good idea. For one
> thing, it's hard to get right (see above) and for another, it means
> that the only error message you ever see is "syntax error".
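For anyone who wants to see the problem with the RFC 7489 pattern for themselves, here is a minimal Python sketch. The pattern is the one quoted above; the helper name matches_rfc7489 is made up for this example, and fullmatch() is used on the assumption that it mirrors how an XSD pattern has to match the whole value.

```python
import re

# Pattern quoted from the RFC 7489 schema (Appendix C).
# XSD patterns must match the entire value, so fullmatch() mimics that.
RFC7489_IPV6 = re.compile(r"([A-Fa-f0-9]{1,4}:){7}[A-Fa-f0-9]{1,4}")


def matches_rfc7489(address: str) -> bool:
    """Hypothetical helper: True if address satisfies the RFC 7489 pattern."""
    return RFC7489_IPV6.fullmatch(address) is not None


if __name__ == "__main__":
    # Full eight-group form is accepted, the "::" form is not.
    for candidate in ("2001:db8:0:0:0:0:0:1", "2001:db8::1"):
        print(candidate, matches_rfc7489(candidate))
```

The pattern requires exactly eight colon-separated groups, so any address written with "::" fails it.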
The fact that the regex in RFC 7489 does not allow the use of zero-bit
compression "::", yet everyone uses it in the XML they send out,
suggests to me that:

1) no-one is using that xsd for anything, or
2) everyone tried, failed, and updated their xsd with a better regex.

> I'd rather have a loose regex and let whatever is parsing the addresses
> complain if they're bad. A well written IPv6 handler will also know that
> all of the valid IPv6 addresses are in the 2xxx: range and will reject
> link local, ULA, and other invalid stuff. For the same reason a well
> written IPv4 handler will reject 0.x.x.x 10.x.x.x and 127.x.x.x.

I'm not suggesting that the regex should prohibit all kinds of private,
link-local, and similar globally unroutable addresses.

>> RFC 5952 has more to say about leading zeroes being disallowed in a
>> 16-bit field, unless the field contains a single zero. This is not
>> disallowed by the proposed regex.
>
> Indeed it says that, but it is obvious what 2001:0123:: means and it's
> silly to reject it.
>
> I would also accept addresses like ::ffff:12.34.56.78 which can appear
> in logs when a single server handles both v4 and v6 addresses. Again,
> it's obvious what it means, what benefit is there to rejecting it?

Experience with RFC 7489 indicates to me that we should not worry too
much about being too restrictive. In the DMARC aggregate reports I've
received over the years, all IPv6 addresses have been in the canonical
format described by RFC 5952.

* There are no instances of leading zeroes
* There are no instances of anyone supplying IPv4-mapped IPv6 addresses
  or other unroutable IPv6 addresses
* There are no instances of using upper case letters

This change would help cement the standing of the canonical textual
representation format specified in RFC 5952 as the accepted format for
IPv6 addresses in DMARC reports.
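To illustrate how a report consumer could test for the properties listed above without a strict regex, here is a minimal Python sketch. It assumes that the standard ipaddress module prints ordinary global unicast addresses in a form that agrees with RFC 5952 (lower case, no leading zeroes, longest zero run compressed). The function name is_canonical_global is made up for this example and is not taken from any DMARC tooling.

```python
import ipaddress


def is_canonical_global(source_ip: str) -> bool:
    """Hypothetical check: globally routable and written in canonical form."""
    try:
        addr = ipaddress.ip_address(source_ip)
    except ValueError:
        # Not parseable as an IPv4 or IPv6 address at all.
        return False
    # Treat IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) as non-canonical
    # for reporting purposes; this is a choice made for this sketch.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return False
    # Reject private, link-local, loopback and similar ranges.
    if not addr.is_global:
        return False
    # Canonical means the input equals the library's own rendering
    # (assumed here to agree with RFC 5952 for these addresses).
    return str(addr) == source_ip


if __name__ == "__main__":
    for candidate in ("2606:4700:4700::1111", "2606:4700:4700::ABCD", "fe80::1"):
        print(candidate, is_canonical_global(candidate))
```

A parser-based check like this can also tell the caller which test failed, which a single regex cannot.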
Should this not be persuasive, and we instead want to change to a more
permissive regex as suggested upthread, I suggest that we at least
mention RFC 5952 and strongly encourage it, maybe like this:

--- a/draft-ietf-dmarc-aggregate-reporting.md
+++ b/draft-ietf-dmarc-aggregate-reporting.md
@@ -135,6 +135,9 @@ system. For each IP address that is being reported, there will be at least one
 "record" element. Each "record" element will have one "row", one "identifiers",
 and one "auth_results" sub-element. Within the "row" element, there MUST be
 "source_ip" and "count".
+The value in source_ip SHOULD either be a globally routable IPv4 unicast address
+in the dotted-decimal format, or a globally routable IPv6 Global Unicast address
+in the canonical textual representation format; see RFC 5952 for details.
 There MUST also exist a "policy_evaluated", with sub-elements of "disposition",
 "dkim", and "spf". There MAY be an element for "reason", meant to include any
 notes the reporter might want to include as to why the "disposition" policy

Daniel K.
