First off: thanks for the careful review with proposals for better wording! Notes below.
On Oct 28, 2025, at 01:35, Martin J. Dürst <[email protected]> wrote:

> I'm not listing minor grammatical mistakes, of which I have found quite a
> few. These can be dealt with by the RPC.

Feel free to send those to me off-list so that I can make the RPC's job easier. (Most of the list doesn't know this, but Martin has helpfully made major and minor suggestions on quite a few of my drafts, all to their betterment, for more than 25 years.)

> Content, major: The draft needs to say that RFCs are written (mainly/mostly)
> in English. I know this was discussed, but I haven't seen the main argument,
> namely that we define policy and that this is policy. And if this isn't
> policy, then nothing in this draft is.

The WG earlier decided that that policy already belongs to the RFC Editor, and is already reflected in their Style Guide. Some of the concern, which I agree with, is "what is English?" and whether trying to define that anywhere benefits anyone. If at some point, one of the streams wants to publish an RFC that is not in English, that stream will have to have a (likely contentious) talk with the RFC Editor.

To reiterate: 9280bis, which this WG has already approved, leaves these types of decisions to the RPC, with the understanding that the RPC has been quite transparent with the community when issues have come up.

> Editorial, major: The abstract should be written so that it can be read even
> in 10 or 20 years, which means it should not contain (and in particular
> shouldn't start) with historic references. As a start, the first paragraph of
> the abstract should move to the introduction, and the first two sentences of
> the introduction should in turn move to the abstract. After that, a bit of
> cleanup will be needed.

Thanks, I like this. When we start a draft, it's for the WG; by the time it's done, it should be for future readers.

> Content, major: Section 2 is entitled "Basic Requirements for Text in RFCs".
> But the way it's written, it contains requirements for "readers and
> browsers", people, maybe fonts, and searches. The text should be rewritten to
> actually talk about text in RFCs. As an example, instead of "RFCs should be
> displayed correctly across a wide range of readers and browsers.", write
> "RFCs should only contain text that can be displayed correctly across a wide
> range of readers and browsers.". Similar for the rest of the section.

Agree.

> Content, major: Section 3: "There are many Unicode characters that obviously
> cannot be displayed (such as control characters), and many whose ability to
> be displayed is debatable.": It's unclear what "many whose ability to be
> displayed is debatable." means. I'd guess it refers to scripts and characters
> standardized recently, for which font support is still thin. If that's what
> is meant, please say so; if something else is meant, please make clear what
> that is.

There is a wide variety of things that can be debatable. Are combining characters like U+0315 (COMBINING COMMA ABOVE RIGHT) displayable? What about non-spacing marks like U+0650 (ARABIC KASRA)? I am sure people would take each side of the debate ("I can see the symbol printed in the Unicode Standard" vs. "I can't see that code point on my laptop even though it has quite a complete font set" and so on).

> Content, major: Section 3 points to BCP137 for various notations. These are
> all numeric. There are many places where numeric notation is appropriate. But
> RFC7997 also recommends the use of Unicode character names. I see no reason
> to change this, as support for this is also available in RFC2XML. In some
> cases (see also below), character names make an RFC more readable because
> they reduce additional lookups. (I have nothing against mentioning that in
> some cases, Unicode character names contain errors, and in these cases, an
> official alias should be used.)

Yep, there seems to be rough consensus for this on the list, and I'll make that change.

> Content, major (same paragraph): "If an RFC includes such characters in
> normative or descriptive text, the RFC needs to also clearly describe the
> character.": There may be cases, in particular for the correct display of
> examples including bidirectional text in plain text, where we want to use
> bidi control characters but we do not want to "describe" them (because they
> are not needed in HTML or PostScript).

Why would we not want to describe them? We are quite sure that some people reading the RFC will have them displayed R-to-L, and others L-to-R.

> Content, major: 3.1 Names: This section confuses ASCII and Latin script. If
> you look at recent RFCs such as RFC 9694 (sorry, that was just the example
> that was easiest for me to find), the name is there in Latin script (M.J.
> Dürst at the top, Martin J. Dürst at the end), without an "ASCII
> interpretation". And there would be no point to force me to add an "ASCII
> interpretation" next time I write an RFC. So please change "These authors can
> give their names using only ASCII characters, or as Unicode characters and an
> ASCII interpretation of their name." to
> "Authors can give their names using only Latin script characters, or using
> non-Latin script and an equivalent in Latin script." Please note that this
> includes e.g. somebody (fictional) with a name of 加藤 竜太郎 with a Latin (not
> ASCII) equivalent of Ryūtarō Katō (if the person prefers this to the simpler
> Ryutaro Kato). Please also note that I'm using "equivalent", not
> "interpretation". There's no interpretation involved.

Yep, good change.

> Editorial, medium: Please remove "Authors of RFCs whose names include
> non-ASCII characters will likely have preferences for how their names are
> displayed based on their lived experiences." People, including authors, just
> have names.

I fully disagree that authors don't have preferences.
In fact, at various times in the past, you have had different preferences about the spelling of your surname in IETF documents. :-) In particular, some authors with Han / Kanji names have asked that their names be spelled with Latin characters, others have asked for their names to be spelled only with Han / Kanji, and yet others want both (often with the Latin of their family name in all caps). These are preferences that I think should be acknowledged and honored when sensible, even if that bugs some other people.

> Content, major: "Company names and geographic names generally do not need
> ASCII interpretations, but they can be included at the discretion of the
> author and the RPC.": This would mean that I could give my affiliation as
> 青山学院大学 and my address as 相模原、日本 or so, but it surely can't be what we want.

If that's what the author of an RFC and their stream manager want, then it is indeed what we want. The RPC can disagree, but that disagreement is on a case-by-case basis, not colored by this document.

> Content, major: RFCs currently use last (family) name plus initial(s) in many
> places, and we should change this (as a matter of policy if necessary). The
> reason is that there are many people where the family name isn't very
> informative. This is very frequent for Koreans, Chinese, and Danish. It can
> also happen in other cultures.

I fully agree, but that's a topic for the Style Guide, not this document. If you start a thread about this on rfc-interest@, I would certainly participate.

> Editorial, minor: 3.2 Examples: "giving the Unicode equivalent of the
> non-ASCII characters": This is confusing because these characters will be in
> UTF-8 and therefore will use Unicode. What we want to say is to use Unicode
> code points or Unicode character names.

Yep, good catch.

> Editorial, major: When talking about color, the text says "If so, those
> examples need to also include the "U+NNNN" syntax.". This excludes the
> possibility to use Unicode character names.
> But as has been discussed in
> previous mail, in the example at hand, it would be much more helpful for the
> reader to replace 'For example, "A color display should be able to
> differentiate 🔴 (U+1F534), 🟢 (U+1F7E2), and 🔵 (U+1F535)."' with 'For example,
> "A color display should be able to differentiate 🔴 (LARGE RED CIRCLE), 🟢
> (LARGE GREEN CIRCLE), and 🔵 (LARGE BLUE CIRCLE).", because it saves somebody
> with a black-and-white display some lookups.

Yep, there has been a lot of agreement on the list about using names and U+NNNN here.

> Content, major: 5. Security: "Valid Unicode that matches the expected text
> must be verified in order to preserve expected behavior and protocol
> information.": It's totally unclear what this means, and who should deal with
> it. Maybe this should read "Authors and the RPC should cross-check that the
> used characters match their code point numbers or Unicode character names."
> If something else is intended, please make clearer what it is.

I think that is what was intended, and your wording is clearer.

> Editorial, minor: The reference label "[UnicodeCurrent]" should be changed to
> "[UnicodeLatest]", because that will help people who are familiar with
> Unicode terminology.

Excellent!

> In the reference section, the year should be removed because that's how the
> Unicode Consortium advises to cite the latest version, see e.g. "Version
> References" at https://www.unicode.org/versions/Unicode17.0.0/. If the RFC
> Editor doesn't allow to remove the year, then at least 2025 should be used
> (currently 2023).

Agree, but it would be even better with no year. The RPC has a references specialist (hi, Ted!), and I'm sure that he would be interested in this. This is a topic for rfc-interest@; I'll start it there.

> Content, minor: "in Normalization Form C (NFC) as defined in [UnicodeNorm]":
> I recently learned this by accident, but Unicode Standard Annex #15 does no
> longer actually define normalization.
> Paragraph 3 of the Introduction says
> "For the formal specification of the Unicode Normalization Algorithm, see
> Section 3.11, Normalization Forms in [Unicode].". So please change this at
> least to "in Normalization Form C (NFC) as defined in Section 3.11,
> Normalization Forms, in [UnicodeLatest] and [UnicodeNorm]".

Accidents with Unicode are so fun...

> Editorial, minor: For [UnicodeNorm] (if it's kept), change
> 'The Unicode Consortium, "Unicode Standard Annex", 2023' to
> 'The Unicode Consortium, "Unicode Standard Annex #15, Unicode Normalization
> Forms", 2025'.

Will do.

Thanks again!

--Paul Hoffman
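
P.S. For anyone who wants to try the cross-check suggested in the Security comment above ("cross-check that the used characters match their code point numbers or Unicode character names"), here is a minimal sketch in Python. It is only an illustration, not anything the draft or the RPC specifies. The function names check_annotation and is_nfc are made up for this example, the results depend on the Unicode version bundled with your Python, and the sketch does not handle the formal name aliases mentioned earlier in this message.

```python
import unicodedata


def check_annotation(char, code_point=None, name=None):
    """Return a list of mismatches between one character and its annotation.

    `code_point` is a string in U+NNNN form; `name` is a Unicode character
    name. Both are optional. (Hypothetical helper, for illustration only.)
    """
    if len(char) != 1:
        return [f"expected exactly one code point, got {len(char)}"]

    problems = []
    if code_point is not None:
        actual = f"U+{ord(char):04X}"
        if actual != code_point.upper():
            problems.append(f"annotated {code_point}, actually {actual}")
    if name is not None:
        # unicodedata.name() returns the default "" for unnamed code points
        # (for example, control characters).
        actual_name = unicodedata.name(char, "")
        if actual_name != name:
            problems.append(f"annotated {name!r}, actually {actual_name!r}")
    return problems


def is_nfc(text):
    """True if `text` is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


if __name__ == "__main__":
    # The annotated example from the color discussion above.
    print(check_annotation("\U0001F534", "U+1F534", "LARGE RED CIRCLE"))
    # A deliberately wrong annotation: blue circle labeled as red.
    print(check_annotation("\U0001F535", "U+1F534", "LARGE RED CIRCLE"))
    # Precomposed u-umlaut versus "u" followed by a combining diaeresis.
    print(is_nfc("D\u00FCrst"), is_nfc("Du\u0308rst"))
```

The first call should report no problems and the second should report two; the last line shows that only the precomposed spelling of "Dürst" is in NFC.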
