Re: Is non-breaking space a space?

2003-12-02 Thread Erik Trulsson
On Tue, Dec 02, 2003 at 09:18:20AM +0100, Jean-Baptiste Quenot wrote:
 * Erik Trulsson:
 
  On Tue, Dec 02, 2003 at 01:31:07AM +0100, Jean-Baptiste Quenot wrote:
 
   In /usr/src/share/mklocale, the file la_LN.ISO8859-1.src for example
   contains a  SPACE definition  that includes the  non-breaking space.
   It seems that it is so since  the beginning of FreeBSD, but is there
   some reference, some standard that states whether NBSP is considered
   a space or not?
 
  If you look at the locale definitions found at
  http://www.dkuug.dk/JTC1/SC22/WG15 it  would seem that NBSP  should be
  considered  as  a space  character,  but  there  might be  some  other
  standard somewhere else that says differently.
 
 That's  also   my  opinion.   Let's   explain  the  whole   story:   I'm
 reformatting my  email messages with  textproc/par, and I  noticed since
 I'm using FreeBSD  that all non-breaking spaces are  converted to spaces
 during filtering,  just because isspace(160)  is true.  Of course,  if I
 put non-breaking  spaces in my text,  I'm not expecting the  lines to be
 broken on them, and I don't want  them to be filtered out, because nbsps
 make sense when used appropriately.
 
 After a while,  I discovered that the issue is  related to locales.  And
 IMHO it  makes sense  not to consider  nbsp as a  space.  Where  shall I
 report the problem?

I would say that is a problem with the tool you are using, in that it
does not seem to be aware of the existence of non-breaking spaces, or
to treat them specially.

I think that NBSP should be considered a space (if nothing else, the
very name "non-breaking space" implies that it is a space, albeit not
a normal one), but it should not be considered a word separator.
Unfortunately many programs (and many standards, for that matter) assume
that all types of whitespace are word separators as well, which they
probably shouldn't.
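
To illustrate the distinction, here is a rough sketch of a word counter
for ISO 8859-1 text that accepts whatever isspace() reports as
whitespace but refuses to split on NBSP.  The names NBSP,
is_word_separator and count_words_latin1 are made up for this example;
this is not how par or any other existing tool is implemented.

```c
#include <ctype.h>
#include <stddef.h>

#define NBSP 0xA0  /* non-breaking space in ISO 8859-1 (assumed encoding) */

/* A word separator is any isspace() character except NBSP. */
static int is_word_separator(unsigned char c)
{
    return c != NBSP && isspace(c);
}

/* Counts words in a NUL-terminated ISO 8859-1 string. */
size_t count_words_latin1(const char *s)
{
    size_t words = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        if (is_word_separator((unsigned char)*s)) {
            in_word = 0;            /* a separator ends the current word */
        } else if (!in_word) {
            in_word = 1;            /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

With this, a string such as "10<NBSP>km away" counts as two words
whether or not the current locale puts 0xA0 in its space class.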


-- 
Insert your favourite quote here.
Erik Trulsson
[EMAIL PROTECTED]


Is non-breaking space a space?

2003-12-01 Thread Jean-Baptiste Quenot
Hello,

I'm wondering why the non-breaking space is considered a space in the
FreeBSD C library, whereas it is not in GNU libc.  Sorry for comparing
the two, but as a result Linux and FreeBSD are incompatible in the way
they handle isspace(160).  This *only* occurs when LC_CTYPE is set to a
single-byte (8-bit) locale such as en_US.ISO8859-1.
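
For anyone who wants to check this on their own system, a minimal
sketch could look like the following.  The helper name
nbsp_is_space_in is invented for this example, and it assumes the
locale you pass is actually installed; note that it resets LC_CTYPE to
"C" before returning.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Returns 1 if isspace(0xA0) is true under the named LC_CTYPE locale,
   0 if it is false, and -1 if the locale cannot be selected. */
int nbsp_is_space_in(const char *locale_name)
{
    int result;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;                  /* locale not available here */

    result = isspace(0xA0) != 0;    /* 0xA0 == 160, NBSP in ISO 8859-1 */

    setlocale(LC_CTYPE, "C");       /* reset to the default "C" locale */
    return result;
}
```

Calling it with "en_US.ISO8859-1" on both systems and printing the
result should show the difference, if the behaviour is as I describe.
In the plain "C" locale it should always return 0.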

In  /usr/src/share/mklocale, the  file  la_LN.ISO8859-1.src for  example
contains a  SPACE definition that  includes the non-breaking  space.  It
seems that it  is so since the  beginning of FreeBSD, but  is there some
reference, some standard that states  whether NBSP is considered a space
or not?

BTW the « official » sources [1] for the glibc ctype functions have an
interesting comment:

static bool
is_space (unsigned int ch)
{
  /* Don't make U+00A0 a space. Non-breaking space means that all programs
     should treat it like a punctuation character, not like a space. */

Best regards,
-- 
Jean-Baptiste Quenot
http://caraldi.com/jbq/

[1] 
http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/localedata/gen-unicode-ctype.c?rev=1.4&content-type=text/x-cvsweb-markup&cvsroot=glibc

