[bug-libunistring] trailing spaces when wrapping text with u8_width_linebreaks

Oliver Kiddle Mon, 16 Nov 2015 16:32:02 -0800

In a program I'm writing, I need to add line breaks to a paragraph
of text that is displayed on a terminal. u8_width_linebreaks is very
helpful in this regard but I have a couple of questions.


In my code, I initially just iterated through the result array and
inserted a newline wherever the current element was set to something
other than UC_BREAK_PROHIBITED. The result is that my output contains
trailing spaces, which is not what I want. It also appears that one
column is being reserved for that space when the text width is
considered. I can't just remove the last character before the break
because sometimes it is a hyphen or some other character whose
properties allowed the line break to be there. What is the best way to
handle these issues? Is there a property corresponding to characters
that should be elided when adding a line break? What I have at the
moment seems rather clumsy: I go back with u8_prev until
uc_is_property_grapheme_base and then check uc_is_property_space. Is
there a better way to iterate through UTF-8 over full graphemes?
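
To make the problem concrete, here is a minimal sketch of the output
loop with trailing spaces dropped. It is deliberately simplified: the
BRK_* constants are local stand-ins for the UC_BREAK_* values from
<unilbrk.h> (so the sketch compiles without libunistring), the name
wrap_strip_spaces is made up for illustration, only the ASCII space
U+0020 is treated as removable, and combining characters attached to a
space are not handled. It also does nothing about the column that
appears to be reserved for the space when the width is computed.

```c
#include <stddef.h>

/* Stand-ins for UC_BREAK_PROHIBITED / UC_BREAK_POSSIBLE from <unilbrk.h>.
   The numeric values here are arbitrary; real code should include the
   header and use the library's constants. */
enum { BRK_PROHIBITED = 1, BRK_POSSIBLE = 2 };

/* Copy s[0..n) to out, inserting '\n' before every position i where
   p[i] == BRK_POSSIBLE (a break between s[i-1] and s[i]) and dropping
   ASCII spaces that would otherwise sit right before that newline.
   A hyphen or any other non-space character before the break is kept.
   out must have room for 2*n + 1 bytes.
   Returns the number of bytes written, not counting the final NUL. */
size_t wrap_strip_spaces(const char *s, size_t n, const char *p, char *out)
{
    size_t len = 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i] == BRK_POSSIBLE) {
            /* Remove spaces already copied at the end of this line. */
            while (len > 0 && out[len - 1] == ' ')
                len--;
            out[len++] = '\n';
        }
        out[len++] = s[i];
    }
    out[len] = '\0';
    return len;
}
```

With the text "aa bb cc" and breaks marked before each 'b' and 'c' run,
this yields three lines with no trailing spaces, while "a-b" with a
break before 'b' keeps its hyphen.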

What would perhaps make this much easier is a flag (UC_BREAK_ELIDE
perhaps) that could be set on any characters that need to be removed
when wrapping (along with their associated combining characters).

Another point of note is that the documentation for u8_width_linebreaks
claims that it chooses "the best line breaks". It actually appears to
use a simple greedy algorithm so it does use as few lines as possible
but that isn't really what I'd call the "best line breaks". The output
is often a lot nicer if differences in line lengths are reduced.
Depending on how "best" is defined, an optimal algorithm may be
expensive, but it is possible to do better than the greedy algorithm
efficiently. You may have come across par, which is similar to fmt but
does just that.
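
As an illustration of what "better than greedy" can mean, here is a
sketch of the textbook dynamic-programming approach that minimises the
sum of squared unused columns over all lines except the last. This is
not necessarily what par does and it is not what libunistring does; the
function name balance_breaks and the representation (an array of word
widths, one column between words) are assumptions made for the example.
It runs in O(n^2) time in the worst case.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Choose breaks for n words with column widths w[0..n) so that the sum
   of squared trailing slack over all lines except the last is minimal.
   Adjacent words on a line are separated by one column.
   On success returns 0, and for every word i that starts a line in the
   chosen layout, next[i] is the index of the word starting the following
   line (n if that line is the last one). Follow the chain from next[0].
   Returns -1 on allocation failure or if some word is wider than width. */
int balance_breaks(const int *w, size_t n, int width, size_t *next)
{
    long *cost = malloc((n + 1) * sizeof *cost);
    if (cost == NULL)
        return -1;
    cost[n] = 0;                      /* nothing left to place */
    for (size_t i = n; i-- > 0; ) {   /* fill the table from the end */
        long best = LONG_MAX;
        int used = -1;                /* -1 cancels the first separator */
        for (size_t j = i; j < n; j++) {
            used += 1 + w[j];         /* words i..j on one line */
            if (used > width)
                break;
            /* The last line is not penalised for being short. */
            long slack = (j + 1 == n) ? 0 : (long) (width - used);
            long c = slack * slack + cost[j + 1];
            if (c < best) {
                best = c;
                next[i] = j + 1;
            }
        }
        if (best == LONG_MAX) {       /* word i does not fit at all */
            free(cost);
            return -1;
        }
        cost[i] = best;
    }
    free(cost);
    return 0;
}
```

For word widths 3, 2, 2, 5 and a line width of 6, a greedy fill gives
lines of 6, 2 and 5 columns, whereas this sketch picks 3, 5 and 5
columns, which is the same number of lines but visibly more even.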

Thanks

Oliver
