On Monday, 20 January 2025 at 00:19:44 UTC, Richard (Rikki) Andrew Cattermole wrote:
On 24/11/2024 9:53 AM, cookiewitch wrote:
I… don't know? My idea of a "word" here is any unbreakable unit. I guess I have another round of looking up Unicode algorithms ahead of me.

Unfortunately word breaking isn't a simple algorithm to implement.

https://www.unicode.org/reports/tr29/#Word_Boundaries

White space and punctuation alone cannot differentiate words.

Nor can they be used to fake identifiers in a programming language.
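
To make the point above concrete, here is a minimal sketch of the naive approach, assuming a splitter that treats every non-word character as a boundary. `naive_words` is a hypothetical helper written for this illustration, not something from Fluid or any Unicode library. It cuts "can't" and "3.14" apart, which the UAX #29 rules keep together, and it returns unspaced Japanese text as one chunk even though it contains several words.

```python
import re


def naive_words(text: str) -> list[str]:
    """Hypothetical naive splitter: any run of non-word characters is a boundary."""
    return [w for w in re.split(r"[^\w]+", text) if w]


# The apostrophe is treated as a separator, so the contraction is cut in two.
print(naive_words("can't"))

# The decimal point is treated as a separator, so the number is cut in two.
print(naive_words("3.14"))

# No spaces or punctuation at all, so several words come back as one chunk.
print(naive_words("日本語のテキスト"))
```

Handling these cases properly needs the character-class rules from the report linked above, and text without spaces generally needs dictionary support on top of that.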

Additionally many words are actually break-
-able because you can hyphenate them to wrap
over multiple lines neatly. Books and news-
-papers employ this technique more
frequently than most other mediums due
to limited page space and printing costs.

However, when space isn’t physically limited (and a book or newspaper is not being emulated), breaking up words is unfavourable because it compromises readability. So it’s nice to have a word-breaking option (for when emulating printed media is desirable, like rendering a newspaper texture with randomly generated text), but it should by no means be the default. And it’s a lot of work for such a subtle and language-specific feature.
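
As a rough sketch of what an opt-in option could look like, here is a greedy wrapper where hyphenation is off unless requested. Everything in it is an assumption made for illustration: `wrap`, its `hyphenate` parameter, and the idea that permitted break points arrive already marked with soft hyphens (U+00AD). Finding those break points is the hard, language-specific part, and this sketch does not attempt it.

```python
SHY = "\u00ad"  # soft hyphen: marks a permitted break point inside a word


def _join(line: str, piece: str) -> str:
    """Append a piece to a line, adding a space only if the line is non-empty."""
    return f"{line} {piece}" if line else piece


def wrap(text: str, width: int, hyphenate: bool = False) -> list[str]:
    """Greedy wrap; words are split only when hyphenate is True (hypothetical API)."""
    lines: list[str] = []
    line = ""
    for word in text.split():
        parts = word.split(SHY)
        whole = "".join(parts)
        if len(_join(line, whole)) <= width:
            line = _join(line, whole)
            continue
        if hyphenate:
            # Try the longest head that still fits with a trailing hyphen.
            for i in range(len(parts) - 1, 0, -1):
                head = "".join(parts[:i]) + "-"
                if len(_join(line, head)) <= width:
                    lines.append(_join(line, head))
                    line = ""
                    whole = "".join(parts[i:])  # the remainder moves to the next line
                    break
        if line:
            lines.append(line)
        # A word (or remainder) longer than width is left to overflow in this sketch.
        line = whole
    if line:
        lines.append(line)
    return lines


text = "books and news\u00adpapers"
print(wrap(text, 15))                  # words kept whole (the default)
print(wrap(text, 15, hyphenate=True))  # the marked word may be split
```

With the default, "newspapers" moves to the next line intact; with `hyphenate=True`, the first line becomes "books and news-" and "papers" follows.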