----- Original Message -----
From: "Ernest Cline" <[EMAIL PROTECTED]>
To: "Philippe Verdy" <[EMAIL PROTECTED]>
Cc: "Unicode Mailing List" <[EMAIL PROTECTED]>
Sent: Sunday, April 04, 2004 4:30 AM
Subject: Re: Fixed Width Spaces

> > [Original Message]
> > From: Philippe Verdy <[EMAIL PROTECTED]>
> >
> > > There is at least one instance where NBSP had best be treated
> > > as a fixed width space, when it is used as thousands separator as in
> > > 100 000. Unicode recognizes it for this use by assigning NBSP the
> > > Bidi Class of CS. I doubt if anyone is going to seriously argue that
> > > the space between 100 and 000 should be expanded upon justification.
> > > Of course, that could be taken care of by adding NBSP to Boundary
> > > class MidNum in the Text Boundaries document (UAX#29) without
> > > affecting its nature when used elsewhere.
> >
> > Isn't that the role of the FIGURE SPACE, or better, of the THIN SPACE ?
>
> FIGURE SPACE main function is a place holder so that the lining up
> of numeric data can be done easily in proportional plain text.
>
> PUNCTUATION SPACE can serve the same function for commas
> and decimals that aren't present in some of the figures but not in all,
> but it might also be appropriate for job of thousands separator in
> general.
>
> THIN SPACE also might be appropriate for the job of thousands
> separator.
>
> However, as far as the Bidirectional Algorithm is concerned,
> NBSP is the one and only space that it recognizes as linking
> adjacent groups of digits into a single number.

Yes, but NBSP cannot be used in most books or in some legal accounting documents, because its minimum width is too large: it leaves enough room for a digit to be inserted. In France, for some legal documents, digits may be grouped with a space, but that space must be thin enough that no additional digit can be inserted in the middle. This is important for bank checks and money orders, for example, and this thin space must also be non-breaking.

The FIGURE SPACE was defined to be incompressible and normally unbreakable from the surrounding digits.
It's a good candidate, except that it is exactly wide enough to let a digit be inserted after the final document has been printed, so its width makes it inappropriate in some legal publications. So THIN SPACE is generally used (possibly in association with ZWJ), even though Unicode has more recently introduced the NARROW NO-BREAK SPACE (NNBSP), which works better than THIN SPACE and has the necessary properties: it should not be expanded independently of the digits during line justification, but it can be contracted if needed to make a number fit within a line.

Many legacy i18n libraries use NBSP for digit grouping, and some even use SPACE (with lots of problems due to unexpected line breaks). NNBSP is certainly the best candidate for now to replace NBSP, where it is supported (THIN SPACE is supported in almost all good layout and composition engines used by publishers and the printed press).

For plain-text documents, NBSP is almost always used. Transforming NBSP into narrower spaces is part of the typesetting job, and is generally performed with the automated tools and converters used by publishers. Most of the time we don't even need THIN SPACE, NNBSP and the like in plain text, simply because SPACE and NBSP are enough to carry the semantics of the text and the line-break opportunities. So all these extra spaces are, in my opinion, part of the rich-text options for typesetting; they exist as characters only because it is simpler to encode them within the plain text than to insert verbose meta-tags.

But I do know that ALMOST ALL newspapers REQUIRE their typists to enter thin spaces appropriately, i.e. as digit grouping separators (thousands, phone numbers, prices, various legal or bank identifiers with fixed formats), or as extensions of basic punctuation.
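
To illustrate the kind of automated conversion I mean, here is a minimal sketch in Python. It is not taken from any publisher's actual tool. The function name and the rule it applies (narrow an NBSP only when it sits directly between two ASCII digits) are my own assumptions for the example. A real converter would need more context, for instance to handle non-ASCII digits or to fall back to THIN SPACE where NNBSP is not supported.

```python
import re

NBSP = "\u00A0"   # NO-BREAK SPACE, the usual plain-text grouping separator
NNBSP = "\u202F"  # NARROW NO-BREAK SPACE, the narrower typesetting choice

# Hypothetical rule: an NBSP with an ASCII digit immediately on each side
# is treated as a digit-grouping separator. The lookarounds are zero-width,
# so consecutive groups such as "1 234 567" are all matched.
_GROUP_SEPARATOR = re.compile(f"(?<=[0-9]){NBSP}(?=[0-9])")


def narrow_group_separators(text: str) -> str:
    """Replace NBSP between digits with NNBSP; leave every other NBSP alone."""
    return _GROUP_SEPARATOR.sub(NNBSP, text)


if __name__ == "__main__":
    # ascii() shows the otherwise invisible characters as escapes.
    print(ascii(narrow_group_separators("100\u00A0000")))  # '100\u202f000'
```

An NBSP that is not between digits (after an abbreviation, say) is left untouched, so the plain text keeps its line-break semantics and only the number groups get the narrower space.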

