Victor Mote wrote:
I know of at least two line-breaking strategies that we probably want to
have in our stock strategies: 1) the line-by-line method used right now, and
2) a TeX-like paragraph-oriented strategy, which AFAIK doesn't exist yet.

Ahem, that's not what I meant, nor is it the scope of UTR14. UTR14 provides "line break opportunities": for example, you can break foo-bar after the hyphen but not 789-123. Which opportunities are actually used is another matter. FOP's current algorithm for determining line break opportunities is utterly simplistic, basically "possibly break before any breaking space, or after a hyphen or slash"; the latter is done only if hyphenation is enabled.
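
To make the difference concrete, here is a minimal sketch of the simplistic rule quoted above. It is not FOP code: the class and method names are made up for illustration, and only the plain ASCII space, hyphen and slash are considered.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not FOP code: the simplistic opportunity rule. */
final class BreakOpportunities {

    /**
     * Returns every index i at which a line may be broken
     * before text.charAt(i).
     */
    static List<Integer> find(String text, boolean hyphenationEnabled) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i < text.length(); i++) {
            char prev = text.charAt(i - 1);
            char cur = text.charAt(i);
            if (cur == ' ') {
                // Possible break before any breaking space.
                result.add(i);
            } else if (hyphenationEnabled && (prev == '-' || prev == '/')) {
                // Possible break after a hyphen or slash.
                result.add(i);
            }
        }
        return result;
    }
}
```

With hyphenation enabled, this reports an opportunity after the hyphen in both foo-bar and 789-123, because it never looks at the character classes around the hyphen. That is exactly the distinction UTR14 would make and this rule does not.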

I omitted the forced line break issue, which is also in the UTR14 scope,
and hyphenation, which may lead to additional line break opportunities
but is outside of the UTR14 scope.

In your URL example, couldn't FOP see the "x-url" language & automatically
add or assume the glue characters for the user? That would perhaps make it
less obtrusive (I assume that you meant for the user).

Well, yes.


I don't see it there yet, but I am a little confused. It seems to me that
line-breaking consists of at least these components: 1) character-based
line-breaking opportunities (which UTR14 addresses), 2) word-based
line-breaking opportunities (which hyphenation dictionaries and patterns
address), and 3) some strategy for using these to find acceptable/optimal
line breaks. It sounds like you have addressed at least 1 and 3 in your
implementation.

Paragraph filling (your point 3) is not addressed. Be careful with the various TRs: UTR14 does not deal with character (rather: grapheme) or word boundaries; that's UAX #29. Actually, we don't use the latter. Our line breaking should probably be done the following way (this implements the "naive" paragraph filling strategy):

 loop
   calculate line width if next character is added
   check for a line breaking opportunity before the next character
   if there is an opportunity
     if the line is not full
       discard the last saved opportunity and save this
     else
       try hyphenation on the string accumulated since the last
       break opportunity (if enabled), save returned opportunity if any
       return saved line breaking opportunity
     end if
   end if
 end loop

hyphenation of a string:
 loop
   skip non-word characters (for this hyphenator)
   word = continuous run of word characters (for this hyphenator)
   if the end of the word is past the end of the line
     try hyphenating the word, generate new break opportunities
     return best fitting line break opportunity or null
   end if
 end loop
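
The hyphenation scan could be sketched in Java roughly as follows. Everything here is an assumption for illustration: the hyphenator is modelled as a plain function from a word to break offsets inside that word, widths are counted in characters, one character is reserved for the hyphen glyph, and -1 stands in for "null".

```java
import java.util.function.Function;

/** Hypothetical sketch, not FOP code: scan a string for a hyphenation break. */
final class HyphenationScan {

    /**
     * s          string accumulated since the last break opportunity
     * available  number of characters that still fit on the line
     * hyphenator maps a word to the offsets at which it may be split
     * Returns the index in s of the best fitting break, or -1 if none.
     */
    static int bestBreak(String s, int available,
                         Function<String, int[]> hyphenator) {
        int i = 0;
        while (i < s.length()) {
            // Skip non-word characters (for this hyphenator: non-letters).
            while (i < s.length() && !Character.isLetter(s.charAt(i))) {
                i++;
            }
            int wordStart = i;
            // A word is a continuous run of word characters.
            while (i < s.length() && Character.isLetter(s.charAt(i))) {
                i++;
            }
            int wordEnd = i;
            if (wordStart == wordEnd) {
                break; // only trailing non-word characters were left
            }
            if (wordEnd > available) {
                // The word runs past the end of the line: try to split it.
                int best = -1;
                String word = s.substring(wordStart, wordEnd);
                for (int offset : hyphenator.apply(word)) {
                    int breakIndex = wordStart + offset;
                    // "+ 1" leaves room for the hyphen glyph.
                    if (breakIndex + 1 <= available && breakIndex > best) {
                        best = breakIndex;
                    }
                }
                return best;
            }
        }
        return -1;
    }
}
```

A real hyphenator would decide for itself what counts as a word character; Character.isLetter is only a stand-in for that decision.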

There is a degenerate case: the line overflows and no line break
opportunity is discovered at all.
The TeX paragraph filling strategy has to detect line break opportunities
the same way, but it is cleverer about selecting which opportunities
become actual line breaks. We could do that too.
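
For illustration only, the naive filling strategy, including the degenerate overflow case, might look like this in Java. This is a sketch under simplifying assumptions and not FOP code: widths are counted in characters, the only break opportunities are single spaces (which are dropped at a break), and hyphenation is left out. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not FOP code: naive (greedy) paragraph filling. */
final class NaiveFiller {

    static List<String> fill(String text, int maxWidth) {
        List<String> lines = new ArrayList<>();
        int lineStart = 0;
        int saved = -1; // last opportunity at which the line still fits
        for (int i = 0; i <= text.length(); i++) {
            boolean atEnd = (i == text.length());
            if (!atEnd && text.charAt(i) != ' ') {
                continue; // no break opportunity before this character
            }
            int width = i - lineStart; // line width if we break here
            if (width <= maxWidth) {
                // Line not full: discard the old opportunity, save this one.
                saved = i;
            } else {
                // Line full: use the saved opportunity. If there is none,
                // this is the degenerate case and the line overflows.
                int breakAt = (saved >= 0) ? saved : i;
                lines.add(text.substring(lineStart, breakAt));
                lineStart = breakAt + 1; // drop the space at the break
                saved = -1;
                i = breakAt; // rescan from the start of the new line
            }
        }
        if (lineStart < text.length()) {
            lines.add(text.substring(lineStart)); // last, unfilled line
        }
        return lines;
    }
}
```

A TeX-like strategy would collect the same opportunities but choose among them for the whole paragraph at once instead of committing to each line as soon as it is full.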

This seems at least remotely related to fo.FOText.isWordChar(), which
attempts to find breaks between words.

Actually, we don't need breaks between words. We need to identify line breaking opportunities, words for the purpose of hyphenation, and resizable spaces for justification. That's why WordArea was such a bad name.

J.Pietschmann


