J.Pietschmann wrote:

> >   Hyphenation problem in Bug 23985
> Actually, implementing UTR14 would solve the line breaking problem,
> although not the URL breaking problem.
> Points to discuss:


> - Should we provide for custom line breaking algorithms?
>   Some languages/scripts like Thai almost certainly require augmenting
>   any stock line breaking algorithms. However, the problem seems to
>   be more clever breaking of non-natural-language stuff, like URLs.
>   We can leave this completely to the FO creators, forcing them, for
>   example, to
>    + use language="x-url" to turn off hyphenation locally
>    + use glue characters like NBZWS to keep the stock line breaking
>     algorithm to break after slashes
>   The latter is quite intrusive.

IMO, yes, we should allow for custom line-breaking, although it somewhat
depends on what level you are thinking of. IIRC, line-breaking is the example
used for the GoF Strategy pattern. We have now implemented the layout
strategy concept, in a simplistic (and, so far, not very useful) way. Any
given layout strategy can control how its line-breaking works. It could conceivably use
one of several "stock" strategies available, its own proprietary method, or
even allow the user to choose. In general, I hope that proprietary methods
can/will be extracted to stock strategies for others to use, but I suppose
that may not always be feasible.
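
To make the Strategy idea concrete, here is a rough sketch of a layout
strategy delegating to a pluggable line breaker. All of the names
(LineBreakStrategy, GreedyLineBreaker, SimpleLayoutStrategy) are hypothetical
and are not existing FOP classes, and width is measured in characters purely
for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface, not an existing FOP class.
interface LineBreakStrategy {
    // Splits text into lines of at most maxWidth characters where possible.
    List<String> breakLines(String text, int maxWidth);
}

// A possible "stock" strategy: greedy, line-by-line (first fit), breaking at spaces.
class GreedyLineBreaker implements LineBreakStrategy {
    @Override
    public List<String> breakLines(String text, int maxWidth) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for empty input
            }
            // Start a new line if this word would overflow the current one.
            if (line.length() > 0 && line.length() + 1 + word.length() > maxWidth) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word); // an over-long word simply overflows its line
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}

// Hypothetical layout strategy: it does not care which breaker it was given.
class SimpleLayoutStrategy {
    private final LineBreakStrategy lineBreaker;

    SimpleLayoutStrategy(LineBreakStrategy lineBreaker) {
        this.lineBreaker = lineBreaker;
    }

    List<String> layoutParagraph(String text, int maxWidth) {
        return lineBreaker.breakLines(text, maxWidth);
    }
}

class StrategyDemo {
    public static void main(String[] args) {
        SimpleLayoutStrategy layout = new SimpleLayoutStrategy(new GreedyLineBreaker());
        for (String l : layout.layoutParagraph("the quick brown fox jumps over the lazy dog", 15)) {
            System.out.println(l);
        }
    }
}
```

A paragraph-oriented breaker would implement the same interface, and the
layout strategy would not need to change.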

I know of at least two line-breaking strategies that we probably want to
have in our stock strategies: 1) the line-by-line method used right now, and
2) a TeX-like paragraph-oriented strategy, which AFAIK doesn't exist yet.

In your URL example, couldn't FOP see the "x-url" language & automatically
add or assume the glue characters for the user? That would perhaps make it
less obtrusive (I assume that you meant for the user).
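
Roughly what I have in mind, as a sketch only: a preprocessing step that
adds a caller-supplied glue character after each slash when the language is
"x-url". The class and method names are hypothetical. I am also assuming
that NBZWS means a zero-width non-breaking character such as U+2060 WORD
JOINER, so the demo uses that; the character to insert is a parameter in
case that reading is wrong.

```java
// Hypothetical helper, not an existing FOP class.
class UrlGlue {
    static final String URL_LANGUAGE = "x-url";
    // Assumption: U+2060 WORD JOINER stands in for "NBZWS".
    static final char WORD_JOINER = '\u2060';

    // Returns text unchanged unless language is "x-url"; otherwise
    // appends the glue character after every '/'.
    static String addGlue(String text, String language, char glue) {
        if (!URL_LANGUAGE.equals(language)) {
            return text;
        }
        StringBuilder out = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            out.append(c);
            if (c == '/') {
                out.append(glue);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // '|' is used here only so the inserted positions are visible.
        System.out.println(addGlue("http://example.org/a/b", "x-url", '|'));
        System.out.println(addGlue("http://example.org/a/b", "en", '|'));
    }
}
```

The FO author would then only have to set the language, and the stock
line-breaking algorithm would see the adjusted text.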

> I've got my own UTR14 implementation (simplified, of course), which
> should appear on http://cvs.apache.org/~pietsch later this evening
> for review. It uses a LineBreakStatus object for tracking the status,
> which might be folded into the LayoutContext or a subclass used for
> inline FOs and text.
> Comments?

I don't see it there yet, but I am a little confused. It seems to me that
line-breaking consists of at least these components: 1) character-based
line-breaking opportunities (which UTR14 addresses), 2) word-based
line-breaking opportunities (which hyphenation dictionaries and patterns
address), and 3) some strategy for using these to find acceptable/optimal
line breaks. It sounds like you have addressed at least 1 and 3 in your
implementation. If the part related to item 1 is factored out for use/reuse,
that sure seems valuable. Then the part related to item 3 becomes (perhaps)
one of the line-breaking strategies available to layout strategies? Or maybe
I have underestimated the scope of UTR14?
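
A sketch of the factoring I am describing, with items 1 and 3 as separate
pieces. The interface and class names are hypothetical. The JDK's
java.text.BreakIterator is used only as a stand-in for item 1; I am not
suggesting it matches your UTR14 implementation. Item 2 (hyphenation) is
left out, and width is counted in characters, including trailing spaces, to
keep it short.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Item 1 (hypothetical interface): where may a line break at all?
interface BreakOpportunityFinder {
    // Returns character offsets after which a break is allowed, ascending.
    List<Integer> findOpportunities(String text);
}

// Stand-in implementation backed by the JDK's line BreakIterator.
class JdkBreakOpportunityFinder implements BreakOpportunityFinder {
    @Override
    public List<Integer> findOpportunities(String text) {
        List<Integer> result = new ArrayList<>();
        BreakIterator it = BreakIterator.getLineInstance(Locale.ROOT);
        it.setText(text);
        it.first(); // skip the boundary at offset 0
        for (int pos = it.next(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }
}

// Item 3 (hypothetical): one simple way to use the opportunities.
class FirstFitChooser {
    // Picks the last opportunity that fits within maxWidth. If none fits,
    // falls back to the first opportunity (the line overflows).
    // Returns -1 when there are no opportunities at all.
    static int chooseBreak(List<Integer> opportunities, int maxWidth) {
        int chosen = -1;
        for (int pos : opportunities) {
            if (pos > maxWidth) {
                break;
            }
            chosen = pos;
        }
        if (chosen == -1 && !opportunities.isEmpty()) {
            chosen = opportunities.get(0);
        }
        return chosen;
    }
}

class ComponentsDemo {
    public static void main(String[] args) {
        String text = "hello world foo";
        List<Integer> opportunities = new JdkBreakOpportunityFinder().findOpportunities(text);
        int breakAt = FirstFitChooser.chooseBreak(opportunities, 13);
        System.out.println(text.substring(0, breakAt));
    }
}
```

With this split, a different finder or a paragraph-oriented chooser could be
swapped in independently of the other.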

This seems at least remotely related to fo.FOText.isWordChar(), which
attempts to find breaks between words.

Victor Mote
