[Rswg] Re: [Ext] Last call comments on draft-rswg-rfc7997bis

Martin J. Dürst Sun, 02 Nov 2025 17:59:50 -0800

Hello Pete, everybody,

Two minor points inline.


On 2025-11-02 22:03, Pete Resnick wrote:

On 31 Oct 2025, at 7:57, Martin J. Dürst wrote:
On 2025-10-29 09:33, Paul Hoffman wrote:
On Oct 28, 2025, at 01:35, Martin J. Dürst <[email protected]> wrote:
Content, major: Section 3: "There are many Unicode characters that obviously cannot be displayed (such as control characters), and many whose ability to be displayed is debatable.": It's unclear what "many whose ability to be displayed is debatable." means. I'd guess it refers to scripts and characters standardized recently, for which font support is still thin. If that's what is meant, please say so; if something else is meant, please make clear what that is.
There is a wide variety of things that can be debatable. Are combining characters like U+0315 (COMBINING COMMA ABOVE RIGHT) displayable? What about non-spacing marks like U+0650 (ARABIC KASRA)? I am sure people would take each side of the debate ("I can see the symbol printed in the Unicode Standard" vs. "I can't see that code point on my laptop even though it has quite a complete font set" and so on).
On any decent browser, these should display without problems. When it comes to editors, shells, and the like, the field is much wider, so there are no absolute guarantees. But these are in Unicode since Unicode 1.0 or so, so I would expect these to show.
I will leave it to you and Paul to replace "debatable" with something clearer.

I'll gladly contribute to text once I have understood what we want to say. Is it about formatting characters such as bidi controls and the like? Is it about characters added to Unicode very recently?
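
To make the distinction concrete, here is a minimal sketch (not part of the draft) that looks up the Unicode general category of the characters mentioned above. It uses only Python's standard unicodedata module; the helper name describe is made up for illustration. The category says nothing about font support, so it only separates "not meant to be visible" (Cc, Cf) from "meant to be visible, but maybe not on your system" (e.g. Mn).

```python
import unicodedata


def describe(ch):
    """Return (code point, character name, general category) for one character.

    Hypothetical helper for illustration; control characters have no
    name in the Unicode Character Database, so a placeholder is used.
    """
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch, "<no name>"),
        unicodedata.category(ch),
    )


# Two combining/non-spacing marks from the discussion, one C0 control
# character, and one bidi formatting character.
for ch in ["\u0315", "\u0650", "\u0007", "\u202E"]:
    print(describe(ch))
```

The two marks come out as Mn (nonspacing mark), the control character as Cc, and the bidi override as Cf (format).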

Content, major (same paragraph): "If an RFC includes such characters in normative or descriptive text, the RFC needs to also clearly describe the character.": There may be cases, in particular for the correct display of examples including bidirectional text in plain text, where we want to use bidi control characters but we do not want to "describe" them (because they are not needed in HTML or PostScript).
But I'm not talking about RTL characters such as Hebrew and Arabic. I'm talking about BIDI control characters, which are invisible (except that they may affect how the graphic characters close to them are ordered). If we need to insert such characters, we shouldn't necessarily talk about these characters, but about how we expect them to reorder the rest of the text (so that readers can check whether they see the text in the order the author expected them to see it).
Chair hat off, a text suggestion: "If an RFC includes such characters in normative or descriptive text, the RFC needs to also clearly describe the character or, as in the case of some control characters, describe the effect of the character."


Good direction. I'd suggest a slight additional tweak:

"If an RFC includes such characters in normative or descriptive text, the RFC needs to also clearly describe the characters or, as in the case of some control characters, describe the effect of the characters."

Using the plural here makes it easier to understand that in some cases, it may be appropriate to describe them as a group, e.g. in their combined effect, as opposed to requiring character by character descriptions even if that's not appropriate.
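
As an illustration of what "describing them as a group" could build on, the following sketch lists the bidi control characters found in a piece of text, so that an author can see which invisible characters an example contains before describing their combined effect. The set below is my reading of the characters carrying the Bidi_Control property (U+061C, U+200E, U+200F, U+202A to U+202E, U+2066 to U+2069); the function name find_bidi_controls and the sample string are made up for this example.

```python
import unicodedata

# Assumed set of bidi control characters (Bidi_Control property):
# ALM, LRM, RLM, the embeddings/overrides with PDF, and the isolates with PDI.
BIDI_CONTROLS = (
    {"\u061C", "\u200E", "\u200F"}
    | {chr(cp) for cp in range(0x202A, 0x202F)}
    | {chr(cp) for cp in range(0x2066, 0x206A)}
)


def find_bidi_controls(text):
    """Return (index, code point, name) for each bidi control in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


# Hypothetical sample: "def" wrapped in a right-to-left override.
sample = "abc \u202Edef\u202C ghi"
for hit in find_bidi_controls(sample):
    print(hit)
```

For the sample, this reports a RIGHT-TO-LEFT OVERRIDE at index 4 and a POP DIRECTIONAL FORMATTING at index 8; the accompanying prose in an RFC would then say how "def" is expected to appear, rather than describing each control separately.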

In particular, some authors with Han / Kanji names have asked that their names be spelled with Latin characters, others have asked for their names to only be spelled with Han / Kanji, and yet others want both (often with the Latin of their family name in all caps). These are preferences that I think should be acknowledged and honored when sensible, even if they bug some other people.
In general, I agree. Only using Latin should of course be possible. Only using Han/Kanji (or any other non-Latin script) I think is a big disservice to the reader, and I'm glad that our current document, as far as I understand it, disallows this. As for putting the family name in all caps, I think that's a style issue that should be left to the RPC.
So you're only looking for a change to the first two sentences to say that all authors, even those who might write their names with non-ASCII characters in other circumstances, can choose to give their names in only ASCII characters in an RFC if that is their preference, and if they choose to use non-ASCII characters, they need to provide an ASCII interpretation of their name.

My understanding is that the current -05 draft, with "These authors can give their names using only ASCII characters, or as Unicode characters and an ASCII interpretation of their name." already includes that.

My main request is to change ASCII to Latin script. The text I'm proposing is: "These authors can give their names using only Latin script characters, or as non-Latin script and a Latin-script equivalent of their name."

I prefer "equivalent" to "interpretation", because for me "interpretation" invokes something like "oh, the spelling of this name suggests the author's ancestors may have been of French origin, most possibly from the nobility". Equivalence just means that it's the same, in some way (see https://en.wikipedia.org/wiki/Equivalence_relation). In our case, it's the same if it denotes the same person.
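
To show why "ASCII" and "Latin script" are different requirements, here is a rough sketch. Python's standard library does not expose the Unicode Script property, so the check below is only a heuristic that I am assuming for illustration: a letter counts as Latin if its Unicode character name starts with "LATIN". The function name is_latin_script and the Han sample are made up; a real tool would use the Script property from the Unicode Character Database.

```python
import unicodedata


def is_latin_script(name):
    """Heuristic: True if every letter in name has a Unicode name starting with LATIN.

    Non-letters (spaces, periods, hyphens, combining marks) are ignored.
    This approximates the Script property and is not a substitute for it.
    """
    return all(
        unicodedata.name(ch, "").startswith("LATIN")
        for ch in name
        if ch.isalpha()
    )


# A Latin-script name that is not ASCII, and an illustrative Han sample.
for name in ["Martin J. Dürst", "Pete Resnick", "\u5C71\u7530"]:
    print(name, "ascii:", name.isascii(), "latin:", is_latin_script(name))
```

A name such as "Dürst" passes the Latin-script check but not the ASCII check, which is exactly the case the proposed wording is meant to cover.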



Regards,    Martin.

--
rswg mailing list -- [email protected]
To unsubscribe send an email to [email protected]
