From: Unicode <[email protected]> On Behalf Of Jukka K. Korpela via Unicode
Sent: Friday, January 16, 2026 8:46 PM
…
> Generally, whether a character is closing, final, initial, or opening
> punctuation should be based on language-specific information, such as
> CLDR.
I would advise against that, since 1) language information is not always
available, 2) even when available, it is not reliable, and 3) even when
available and correct, people often use their primary language’s quotation
convention even in their second/third/… language.
For quotation marks, it is an unfortunate historical accident that different
typographic traditions (not really languages) have different conventions.
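
To illustrate the point about differing conventions, here is a minimal
sketch using Python's standard unicodedata module. It prints the fixed
General_Category of the quote marks used as (opening, closing) pairs in a
few traditions. The TRADITIONS table and the quote_categories helper are
my own illustrative names, and the table is a simplification of actual
practice, not data taken from CLDR.

```python
import unicodedata

# Hypothetical, simplified table: (opening, closing) quote marks as
# commonly used in a few typographic traditions. Illustrative only.
TRADITIONS = {
    "English": ("\u201C", "\u201D"),  # “ ”
    "German": ("\u201E", "\u201C"),   # „ “
    "Swedish": ("\u201D", "\u201D"),  # ” ”
    "French": ("\u00AB", "\u00BB"),   # « »
}

def quote_categories(pair):
    """Return the Unicode General_Category of each mark in the pair."""
    return tuple(unicodedata.category(ch) for ch in pair)

for name, pair in TRADITIONS.items():
    # Pi = initial quote, Pf = final quote, Ps = open, Pe = close
    print(name, quote_categories(pair))
```

Note that U+201C has the fixed category Pi (initial quote) but serves as
the closing mark in the German pair, and U+201D (Pf, final quote) serves as
both opening and closing mark in the Swedish pair. The character property
alone therefore cannot tell which role a given mark plays.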
For line breaking around “ambiguous” quote marks (and, for that matter,
apostrophes when not used as quotation marks), I have proposed an update to
the Unicode line breaking rules, one that does not depend on language or
typographic tradition, in
https://www.unicode.org/L2/L2025/25261r-line-breaking.pdf.
That should take care of the line breaking issue (very annoying at present)
for “ambiguous” quote marks.
When it comes to the bidi issue with these marks, I note that other brackets
now seem to be treated specially (I haven’t yet checked the latest version
of the bidi algorithm); at least there is a new data file:
https://www.unicode.org/Public/UNIDATA/BidiBrackets.txt. But “ambiguous”
quote marks are not handled. One would still need some bidi control
characters (like RLM, LRM) to fix the issue. But people will generally not
be knowledgeable and meticulous enough to input them. So I would suggest
adding to bidi processing that