In a program I'm writing, I need to add line breaks to a paragraph of text that is displayed on a terminal. u8_width_linebreaks is very helpful in this regard but I have a couple of questions.
In my code, I initially just iterated through the output array looking for elements set to something other than UC_BREAK_PROHIBITED and inserted a newline at each such point. The result is that my output contains trailing spaces, which is not what I want. It also appears that one column is being reserved for that space when the text width is considered. I can't just remove the last character before the break, because sometimes it is a hyphen or some other character whose properties allowed the line break to be there.

What is the best way to handle these issues? Is there a property corresponding to characters that should be elided when adding a line break? What I have at the moment seems rather clumsy: I go back with u8_prev until uc_is_property_grapheme_base is true and then check uc_is_property_space. Is there a better way to iterate through UTF-8 over full graphemes? It would perhaps be much easier if there were a flag (UC_BREAK_ELIDE, perhaps) that could be set on any characters that need to be removed when wrapping, along with their associated combining characters.

Another point of note is that the documentation for u8_width_linebreaks claims that it chooses "the best line breaks". It actually appears to use a simple greedy algorithm, so it does use as few lines as possible, but that isn't really what I'd call the "best line breaks". The output is often a lot nicer if differences in line lengths are reduced. I had assumed that a truly best algorithm would be NP-complete, but it is certainly possible to do better than the greedy algorithm efficiently. You may have come across par, which is similar to fmt but does just that.

Thanks,
Oliver
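
P.S. To make the last point a little more concrete, here is a rough sketch of the kind of thing I mean. It is not libunistring code and I don't claim it is what par does. It assumes the text has already been split into words whose column widths are known, that one space separates words on a line, and that "nicer" means a smaller sum of squared unused columns over every line except the last. The name balanced_breaks and the MAX_WORDS limit are made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define MAX_WORDS 1024  /* arbitrary limit, for this sketch only */

/* widths[i] is the column width of word i.  On return, brk[i] is 1 if
   word i ends a line (the final word always does) and 0 otherwise.
   Returns the number of lines.  A word wider than max_width is given
   a line of its own. */
size_t balanced_breaks(const int *widths, size_t n, int max_width, char *brk)
{
    long cost[MAX_WORDS + 1];    /* cost[i]: least cost for words i..n-1 */
    size_t next[MAX_WORDS + 1];  /* next[i]: first word of the following line */

    assert(n <= MAX_WORDS);
    cost[n] = 0;
    next[n] = n;

    /* Work backwards: try every feasible line starting at word i. */
    for (size_t i = n; i-- > 0; ) {
        long best = LONG_MAX;
        size_t best_next = i + 1;
        int used = 0;

        for (size_t j = i; j < n; j++) {
            used += widths[j] + (j > i ? 1 : 0);  /* space between words */
            if (used > max_width && j > i)
                break;                            /* words i..j do not fit */

            long slack = used < max_width ? max_width - used : 0;
            long here = (j + 1 == n) ? 0 : slack * slack;  /* last line is free */
            long total = here + cost[j + 1];

            if (total < best) {
                best = total;
                best_next = j + 1;
            }
        }
        cost[i] = best;
        next[i] = best_next;
    }

    /* Follow the chosen chain of line starts to mark the line ends. */
    for (size_t i = 0; i < n; i++)
        brk[i] = 0;

    size_t lines = 0;
    for (size_t i = 0; i < n; i = next[i]) {
        brk[next[i] - 1] = 1;
        lines++;
    }
    return lines;
}
```

With word widths 3, 2, 2, 5 and a maximum width of 6, greedy filling puts the first two words on one line and leaves the third alone on a short line, whereas this breaks after the first word and keeps the two middle words together. Both use three lines here, but note that this cost function optimises evenness rather than line count, so in general it is not guaranteed to use the fewest lines. It is quadratic in the number of words in the worst case.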
