Andre Couture wrote:

> Hi
> I did not follow the entire conversation here, but I was curious as to why
> someone would put a non-breaking space between two words.
> We face that in other areas of our code as well.
>
> If the idea of the nbsp is to keep the two apparent words together, would
> it be good to handle the nbsp like a hyphen? That would mean that the two
> words could be treated as either two words or a single one.

No. The nbsp is there to prevent a line break where one would be unsuitable.
A good example is when you write "80 kg". It would be ugly if
80 ended one line and kg began the next.
So putting a nbsp between a number and its unit is useful. But for
LanguageTool this is irrelevant (i.e. the nbsp should be treated like a space).

There are other good examples of nbsp in English here:
  
http://english.stackexchange.com/questions/28467/when-is-it-appropriate-to-use-non-breaking-spaces

There are plenty of space characters in Unicode. See:
https://en.wikipedia.org/wiki/Whitespace_character

I think the tokenization should ideally treat them all the same way, but
I have not checked.
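
To illustrate what I mean, here is a rough sketch in Python. It is not
LanguageTool's actual tokenizer (which I have not checked); the function
names are made up for the example. It assumes that "space character" means
any character in the Unicode general category Zs, which includes U+00A0 and
U+202F, and it replaces each one with exactly one U+0020 so that character
offsets stay the same.

```python
import unicodedata


def normalize_spaces(text):
    """Replace every Unicode space separator (category Zs) with U+0020.

    The replacement is one-for-one, so positions in the text do not shift.
    """
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch
        for ch in text
    )


def tokenize(text):
    """Hypothetical, very naive tokenizer: split on plain spaces only,
    after all space variants have been normalized."""
    return [tok for tok in normalize_spaces(text).split(" ") if tok]
```

With this, tokenize("80\u00a0kg") and tokenize("80 kg") give the same two
tokens, so a rule would not need to know which kind of space was typed.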

In French at least, handling "U+202F NARROW NO-BREAK SPACE"
correctly (i.e. as a space for LT) would be useful. This is the recommended
space to put in front of the punctuation characters ? ! ; :  I know that English
and other languages don't put a space before those characters, but French does,
and it would be ugly if the punctuation character ended up on the next line.
  https://fr.wikipedia.org/wiki/Espace_fine_ins%C3%A9cable
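
As a sketch of why this matters for a checker: below is a hypothetical
"missing space before ? ! ; :" check in Python. It is not an existing
LanguageTool rule, and the set of accepted spaces (U+0020, U+00A0, U+202F)
is my assumption for the example. If U+202F were left out of that set, the
check would wrongly flag text that is typeset correctly.

```python
import re

# Assumption: characters accepted as "a space" before ? ! ; : in French.
FRENCH_SPACES = " \u00a0\u202f"

# Match one of ? ! ; : that is NOT preceded by an accepted space.
MISSING_SPACE = re.compile("(?<![" + FRENCH_SPACES + "])[?!;:]")


def missing_space_positions(text):
    """Return the offsets of ? ! ; : that lack a preceding space."""
    return [m.start() for m in MISSING_SPACE.finditer(text)]
```

For "Comment ça va\u202f?" this returns an empty list, while for
"Comment ça va?" it reports the position of the question mark.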

Regards
Dominique

