2014-02-05 18:22, Markus Scherer wrote:

On Tue, Feb 4, 2014 at 2:25 PM, Rhavin Grobert wrote:

    Parallel to the soft hyphen, a hyphen that is inserted only if the word
    is broken, it would be practical to have some way to tell the browser:
    if you need to break the line, try here first. This would be really
    useful for poems, music lines, addresses,…


That would be HTML <wbr> <http://dev.w3.org/html5/markup/wbr.html> or
U+200B ZERO WIDTH SPACE

As suggested direct line break points, they both work fine, though with a few caveats that make it a bit difficult to decide which one is better; see my treatise
http://www.cs.tut.fi/~jkorpela/html/nobr.html#suggest

In plain text, of course, U+200B is the way. The main problem with it is that some software, including some old browsers like IE 6, does not recognize it but tries to render it as a graphic character, possibly using a font that has no glyph for it. Adding a new character would not help here at all, of course.
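
To make the plain-text case concrete, here is a minimal sketch in Python. The helper names, the "|" marker convention and the example.com URL are all made up for illustration, and treating only SPACE and U+200B as break opportunities is a deliberate simplification rather than the real line breaking rules.

```python
import unicodedata

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def suggest_breaks(text: str, marker: str = "|") -> str:
    """Replace each (hypothetical) marker with U+200B.

    The marker is only an authoring convenience assumed for this sketch;
    the resulting plain text contains invisible break opportunities.
    """
    return text.replace(marker, ZWSP)


def break_opportunities(text: str) -> list:
    """Return the indices of characters after which a break is allowed.

    Simplification: only SPACE and U+200B count here. Real software
    follows far more detailed rules.
    """
    return [i for i, ch in enumerate(text) if ch in (" ", ZWSP)]


sample = suggest_breaks("http://example.com/|very/|long/|path")
print(unicodedata.name(ZWSP))
print(break_opportunities(sample))
```

Whether the character then shows up as nothing at all or as a small rectangle still depends entirely on the software that displays the text.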

    And it would be really easy to implement: there is no visual
    representation needed and if the right code point is chosen, it
    would be downward-compatible with all systems not knowing of the new
    character.

Unlikely.

Indeed, there is no reason to expect old software to silently ignore characters that it does not recognize. Whatever the Unicode Standard might say, old software just does what it has been programmed to do, and this may well be “here’s a character for which I have no special rule, so I’ll use whatever is available in the font(s) I’m using”, typically resulting in a small rectangle that represents a character for which no glyph is available.

But I’m not quite sure what the idea behind the suggestion is. If the idea is to provide an optional break point, in a position where one would not normally be present, then U+200B is the way. Not 100% reliable, but better than anything else (in plain text).

But if the idea is to suggest that among permissible line break points, this one is preferable, then it’s a different issue. Theoretically interesting, but in practical terms, things don’t work that way. In practice, there are permissible line break points (either by implicit rules that e.g. normally allow a break after a space, or by explicit indication by U+200B). Programs will take it from there, and if they do some optimization, like good publishing software does, they typically optimize the division of an entire paragraph into lines, applying several criteria.
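
To illustrate what optimizing a whole paragraph means, here is a rough sketch in Python. It is not how any particular program does it: the cost function (squared unused width on every line except the last), the monospace width model and the function names are assumptions chosen for brevity. Permissible break points are simply spaces and U+200B, and none of them is "preferred"; the optimizer picks among them by overall cost.

```python
import re

ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def tokenize(text):
    """Split at spaces and U+200B into (piece, glue) pairs.

    glue is the separator that followed the piece ('' after the last one).
    """
    parts = re.split(f"([ {ZWSP}])", text)
    return list(zip(parts[0::2], parts[1::2] + [""]))


def wrap_optimal(text, width):
    """Choose breaks minimizing the total squared slack over the paragraph."""
    tokens = tokenize(text)
    n = len(tokens)

    def line(i, j):
        # Join tokens i..j-1. A space glue stays visible inside a line;
        # U+200B is dropped from the output since it has no width.
        head = "".join(p + (g if g == " " else "") for p, g in tokens[i:j - 1])
        return head + tokens[j - 1][0]

    best = [float("inf")] * (n + 1)  # best[i]: minimal cost for tokens[i:]
    nxt = [n] * (n + 1)              # nxt[i]: where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n + 1):
            length = len(line(i, j))
            if length > width and j > i + 1:
                break  # too long; a single overlong piece is still allowed
            slack = max(width - length, 0)
            cost = 0 if j == n else slack ** 2  # last line is free
            if cost + best[j] < best[i]:
                best[i], nxt[i] = cost + best[j], j

    lines, i = [], 0
    while i < n:
        lines.append(line(i, nxt[i]))
        i = nxt[i]
    return lines


print(wrap_optimal("aaa bb cc ddddd", 6))
```

With a width of 6, a greedy first-fit approach would put "aaa bb" on the first line and leave "cc" alone on the second; the paragraph-level optimization instead breaks after "aaa" so that the lines come out more even.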

Yucca



_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode
