Re: NNBSP (was: A last missing link for interoperable representation)

Marcel Schneider via Unicode Thu, 17 Jan 2019 05:59:59 -0800

On 17/01/2019 14:36, I wrote:

[…]
The only thing that searches have brought up


It was actually the best thing. Here’s an even more surprising hit:

               B. In the rules, allow these characters to bridge both
               alphabetic and numeric words, with:

                 * Replace MidLetter by (MidLetter | MidNumLet)
                 * Replace MidNum by (MidNum | MidNumLet)


               -------------------------

               4. In addition, the following are also sometimes used, or
               could be used, as numeric separators (we don't give much
               guidance as to the best choice in the standard):

               0020 SPACE <http://unicode.org/cldr/utility/character.jsp?a=0020>
               00A0 NO-BREAK SPACE <http://unicode.org/cldr/utility/character.jsp?a=00A0>
               2007 FIGURE SPACE <http://unicode.org/cldr/utility/character.jsp?a=2007>
               2008 PUNCTUATION SPACE <http://unicode.org/cldr/utility/character.jsp?a=2008>
               2009 THIN SPACE <http://unicode.org/cldr/utility/character.jsp?a=2009>
               202F NARROW NO-BREAK SPACE <http://unicode.org/cldr/utility/character.jsp?a=202F>

               If we had good reason to believe that if one of these only
               really occurred between digits in a single number, then we
               could add it. I don't have enough information to feel like a
               proposal for that is warranted, but others may. Short of that,
               we should at least document in the notes that some
               implementations may want to tailor MidNum to add some of these.
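
To make the tailoring idea in the quoted note concrete, here is a minimal sketch in Python. It is not an implementation of the UAX #29 word-break rules: it only imitates the one rule that lets a MidNum character sit between two digits without splitting the number. The names DEFAULT_MID_NUM, TAILORED_MID_NUM and number_tokens are hypothetical, and using comma and full stop as the default separators is a simplifying assumption.

```python
import re

# Assumption: "," and "." stand in for the default numeric separators.
DEFAULT_MID_NUM = ",."
# Hypothetical tailoring: also accept U+202F NARROW NO-BREAK SPACE.
TAILORED_MID_NUM = DEFAULT_MID_NUM + "\u202f"


def number_tokens(text, mid_num=DEFAULT_MID_NUM):
    """Return digit runs, joined across one mid_num character flanked by digits."""
    sep = "[" + re.escape(mid_num) + "]"
    pattern = re.compile(r"\d+(?:" + sep + r"\d+)*")
    return pattern.findall(text)


sample = "1\u202f234\u202f567 km"
print(number_tokens(sample))                    # untailored: three digit runs
print(number_tokens(sample, TAILORED_MID_NUM))  # tailored: one number token
```

Without the tailoring, the number falls apart into three tokens; with U+202F added to the separator set it is kept as a single token, which is the effect the note suggests implementations may want.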


I fail to understand what hack is going on. Why didn’t Unicode wish to sort out 
which one of these is the group separator?

1. SPACE: is breakable, hence ruled out.
2. NO-BREAK SPACE: is justifying (it stretches in justified text), hence ruled
out.
3. FIGURE SPACE: has the full width of a digit, which is too wide, hence ruled
out.
4. PUNCTUATION SPACE: has been left breakable against all reason, evidence and
consistency, hence ruled out.
5. THIN SPACE: belongs to the series of breakable spaces, hence ruled out.
6. NARROW NO-BREAK SPACE: is non-breaking and narrow, hence okay.
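
Part of the list above can be checked mechanically. The sketch below uses Python's standard unicodedata module and treats a <noBreak> compatibility decomposition as a rough proxy for "non-breaking". That is an assumption: the authoritative property is the UAX #14 line-break class, which unicodedata does not expose, and justification behaviour and glyph width are not covered at all. The helper names is_no_break and group_digits are hypothetical.

```python
import unicodedata

# The six candidates from the quoted document, in the same order.
CANDIDATES = ["\u0020", "\u00a0", "\u2007", "\u2008", "\u2009", "\u202f"]


def is_no_break(ch):
    """True if the character has a <noBreak> compatibility decomposition."""
    return unicodedata.decomposition(ch).startswith("<noBreak>")


for ch in CANDIDATES:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch):<22} no-break: {is_no_break(ch)}")


def group_digits(n, sep="\u202f"):
    """Format an integer with sep (default U+202F) between groups of three digits."""
    return f"{n:,}".replace(",", sep)


print(group_digits(1234567))
```

Only NO-BREAK SPACE, FIGURE SPACE and NARROW NO-BREAK SPACE pass the no-break test; the remaining objections (justification, width) are typographic and have to be judged outside the character database.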

CLDR agreed to fix this for French in release 34. In the current survey for
release 35, everything is being questioned again: it must be assessed and may
impact implementations, while all other locales that use a space as group
separator are still affected by the poor display that NO-BREAK SPACE produces.

I know we have another public mailing list for that, but I feel it’s important
to submit this to a larger community for consideration and, possibly, for
feedback.

Thanks.

Regards,

Marcel

P.S. For completeness:

http://unicode.org/L2/L2007/07370-punct.html

And also wrt my previous post:

https://www.unicode.org/L2/L2007/07209-whistler-uax14.txt
