On 21/08/2003 13:26, Jim Allan wrote:
Traditionally in C, NBSP was not counted as white space. See
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccelng/htm/eleme_2.asp
for one reference.
This may have been accidental, as C's white space properties were
defined with only the 7-bit ASCII character set in mind.
But it would break current C programs if NBSP were defined as white
space. Logically then, if we exclude NBSP, other "hard" spaces should
also not be defined as white space.
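
As a rough illustration of the C behaviour described above, here is a
minimal sketch. The helper name is_c_space is hypothetical, and the
byte 0xA0 is assumed to stand for NBSP as it does in ISO 8859-1. In
the default "C" locale, isspace() is true only for space, form feed,
newline, carriage return, horizontal tab and vertical tab.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: reports whether byte value c counts as white
   space under the default "C" locale. */
int is_c_space(unsigned char c)
{
    setlocale(LC_CTYPE, "C");   /* force the portable default locale */
    return isspace(c) != 0;     /* normalise the result to 0 or 1 */
}
```

Under these assumptions is_c_space(' ') and is_c_space('\t') give 1,
while is_c_space(0xA0) gives 0. Other locales may classify bytes above
0x7F differently.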
Essentially NBSP was treated by many word processors and text editors
as simply a printing character, like any other printing character,
with no special "spacing" properties. It was only an imitation of a
space in appearance. Undefined characters in fonts might also appear
as imitations of space in many printing systems. That did not make
them white space.
Of course under Unicode specifications NBSP is expected to expand like
SPACE for justification and so assumes some of the attributes of SPACE.
For compatibility I think it best to not include any of the non-breaking
spaces as white space.
Jim Allan
Not counting NBSP as whitespace may make it easier to include spacing
diacritics in patterns, if NBSP rather than space is used to carry them.
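
A small sketch of what I mean, assuming UTF-8 text, the default "C"
locale, and a pattern matcher that splits tokens on isspace(). The
helper name token_length is hypothetical. It returns the length in
bytes of the leading run of non-white-space bytes.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length in bytes of the leading token of s,
   where a token is a run of bytes for which isspace() is false. */
size_t token_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0' && !isspace((unsigned char)s[n]))
        ++n;
    return n;
}
```

With NBSP (bytes C2 A0) carrying COMBINING ACUTE ACCENT (bytes CC 81),
token_length("\xC2\xA0\xCC\x81 x") is 4, so carrier and diacritic stay
in one token. With SPACE as the carrier, token_length(" \xCC\x81 x")
is 0, because the carrier is consumed as white space and the diacritic
is left on its own.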
--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/