On 21/08/2003 13:26, Jim Allan wrote:

Traditionally in C, NBSP was not counted as white space. See http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccelng/htm/eleme_2.asp for one reference.

This may have been accidental, as C's white space properties were defined with only the 7-bit ASCII character set in mind.

But it would break current C programs if NBSP were defined as white space. Logically then, if we exclude NBSP, other "hard" spaces should also not be defined as white space.
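
As a rough illustration of the point above, here is a minimal sketch in C. It assumes the program runs in the default "C" locale (no setlocale() call) and that NBSP is the single Latin-1 byte 0xA0; the helper names is_c_white_space and check_nbsp_in_c_locale are made up for this example and are not part of any library.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: asks the C library whether a byte counts as white
   space. The cast to unsigned char avoids undefined behaviour for values
   above 127 on platforms where plain char is signed. */
int is_c_white_space(int byte)
{
    return isspace((unsigned char)byte) != 0;
}

/* Self-check, assuming the default "C" locale (setlocale() never called). */
void check_nbsp_in_c_locale(void)
{
    assert(is_c_white_space(' '));    /* SPACE is white space */
    assert(is_c_white_space('\t'));   /* so is TAB */
    assert(!is_c_white_space(0xA0));  /* NBSP (Latin-1 0xA0) is not */
}
```

Calling check_nbsp_in_c_locale() from main() should pass silently in the "C" locale. Under other locales the answer for 0xA0 may differ, so code that relies on the traditional behaviour would see a change if NBSP were reclassified.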

Essentially NBSP was treated by many word processors and text editors as simply a printing character, like any other printing character, with no special "spacing" properties. It was only an imitation of a space in appearance. Undefined characters in fonts might also appear as imitations of space in many printing systems. That did not make them white space.

Of course, under Unicode specifications NBSP is expected to expand like SPACE for justification, and so assumes some of the attributes of SPACE.

For compatibility I think it best not to include any of the non-breaking spaces as white space.

Jim Allan

Not counting NBSP as whitespace may make it easier to include spacing diacritics in patterns, if NBSP rather than space is used to carry them.
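
A small sketch of what this could mean in practice, using a plain white-space tokenizer in C rather than a real pattern engine. It assumes UTF-8 input, the default "C" locale, and U+0301 COMBINING ACUTE ACCENT as the example diacritic; count_tokens and check_diacritic_carrier are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts runs of non-white-space bytes in s,
   using isspace() to decide what separates tokens. */
size_t count_tokens(const char *s)
{
    size_t count = 0;
    int in_token = 0;

    for (; *s != '\0'; ++s) {
        if (isspace((unsigned char)*s)) {
            in_token = 0;            /* white space ends the current token */
        } else if (!in_token) {
            in_token = 1;            /* first byte of a new token */
            ++count;
        }
    }
    return count;
}

/* Self-check, assuming the default "C" locale and UTF-8 encoded input. */
void check_diacritic_carrier(void)
{
    /* x, NBSP (C2 A0), COMBINING ACUTE (CC 81), y: stays one token */
    assert(count_tokens("x\xC2\xA0\xCC\x81y") == 1);
    /* x, SPACE, COMBINING ACUTE (CC 81), y: splits into two tokens */
    assert(count_tokens("x \xCC\x81y") == 2);
}
```

With NBSP as the carrier, the diacritic stays inside the surrounding token; with SPACE as the carrier, the token is split at the carrier. A pattern class for "non-white-space" would presumably behave the same way.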

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




