J.Pietschmann wrote:

> [EMAIL PROTECTED] wrote:
> > Hyphenation problem in Bug 23985
>
> Actually, implementing UTR14 would solve the line breaking problem,
> although not the URL breaking problem.
>
> Points to discuss:
...
> - Should we provide for custom line breaking algorithms?
>   Some languages/scripts like Thai almost certainly require augmenting
>   any stock line breaking algorithms. However, the problem seems to
>   be more clever breaking of non-natural-language stuff, like URL.
>   We can leave this completely to the FO creators, forcing them for
>   example
>   + use language="x-url" to turn off hyphenation locally
>   + use glue characters like NBZWS to keep the stock line breaking
>     algorithm to break after slashes
>   The latter is quite intrusive.

IMO, yes, we should allow for custom line-breaking, although it somewhat depends on what level you are thinking of. IIRC, this is the example used for the GoF Strategy pattern. We have now implemented the layout strategy concept in a simplistic (and, so far, not very useful) way. Any given layout strategy can control how its line-breaking works. It could conceivably use one of several "stock" strategies available, its own proprietary method, or even allow the user to choose. In general, I hope that proprietary methods can/will be extracted to stock strategies for others to use, but I suppose that may not always be feasible.

I know of at least two line-breaking strategies that we probably want to have in our stock strategies: 1) the line-by-line method used right now, and 2) a TeX-like paragraph-oriented strategy, which AFAIK doesn't exist yet.

In your URL example, couldn't FOP see the "x-url" language & automatically add or assume the glue characters for the user? That would perhaps make it less obtrusive (I assume that you meant for the user).

> I've got my own UTR14 implementation (simplified, of course), which
> should appear on http://cvs.apache.org/~pietsch later this evening
> for review. It uses a LineBreakStatus object for tracking the status,
> which might be folded into the LayoutContext or a subclass used for
> inline FOs and text.
>
> Comments?

I don't see it there yet, but I am a little confused.
It seems to me that line-breaking consists of at least these components:

1) character-based line-breaking opportunities (which UTR14 addresses),
2) word-based line-breaking opportunities (which hyphenation dictionaries and patterns address), and
3) some strategy for using these to find acceptable/optimal line breaks.

It sounds like you have addressed at least 1 and 3 in your implementation. If the part related to item 1 is factored out for use/reuse, that sure seems valuable. Then the part related to item 3 becomes (perhaps) one of the line-breaking strategies available to layout strategies? Or maybe I have underestimated the scope of UTR14?

This seems at least remotely related to fo.FOText.isWordChar(), which attempts to find breaks between words.

Victor Mote
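
P.S. To make the separation between items 1 and 3 above a bit more concrete, here is a rough sketch of how the pieces might be factored. Everything in it is hypothetical: the names LineBreakSketch, BreakOpportunityFinder, LineBreakStrategy, BreakAfterChars, GreedyStrategy, finderFor and layout are made up for illustration and exist neither in FOP nor in the UTR14 implementation under discussion. The finder is a trivial stand-in that allows a break after spaces (and after slashes for "x-url"), not UTR14, and widths are simply counted in characters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these types exist in FOP. */
final class LineBreakSketch {

    /** Item 1: reports where a break is permitted. */
    interface BreakOpportunityFinder {
        /** Returns ascending indices i such that a break is allowed before text.charAt(i). */
        List<Integer> find(String text);
    }

    /** Item 3: chooses which of the permitted breaks to take. */
    interface LineBreakStrategy {
        List<String> breakLines(String text, List<Integer> opportunities, int maxWidth);
    }

    /** Allows a break after any run of the given characters. A stand-in, not UTR14. */
    static final class BreakAfterChars implements BreakOpportunityFinder {
        private final String breakAfter;

        BreakAfterChars(String breakAfter) {
            this.breakAfter = breakAfter;
        }

        @Override
        public List<Integer> find(String text) {
            List<Integer> result = new ArrayList<>();
            for (int i = 1; i < text.length(); i++) {
                boolean previousBreaks = breakAfter.indexOf(text.charAt(i - 1)) >= 0;
                boolean currentBreaks = breakAfter.indexOf(text.charAt(i)) >= 0;
                if (previousBreaks && !currentBreaks) {
                    result.add(i);
                }
            }
            return result;
        }
    }

    /** Greedy line-by-line strategy: break at the last opportunity that still fit. */
    static final class GreedyStrategy implements LineBreakStrategy {
        @Override
        public List<String> breakLines(String text, List<Integer> opportunities, int maxWidth) {
            List<String> lines = new ArrayList<>();
            int lineStart = 0;
            int candidate = 0; // most recent stop seen on the current line
            List<Integer> stops = new ArrayList<>(opportunities);
            stops.add(text.length()); // treat the end of the text as a final stop
            for (int stop : stops) {
                // Width is counted in characters, ignoring surrounding spaces (an assumption).
                int width = text.substring(lineStart, stop).trim().length();
                if (width > maxWidth && candidate > lineStart) {
                    lines.add(text.substring(lineStart, candidate).trim());
                    lineStart = candidate;
                }
                candidate = stop;
            }
            lines.add(text.substring(lineStart).trim());
            return lines;
        }
    }

    /** Picks a finder from the FO language; "x-url" is the pseudo-language from this thread. */
    static BreakOpportunityFinder finderFor(String language) {
        return "x-url".equals(language) ? new BreakAfterChars(" /") : new BreakAfterChars(" ");
    }

    /** Glues the two parts together; the strategy could be swapped for a paragraph-oriented one. */
    static List<String> layout(String text, String language, int maxWidth) {
        LineBreakStrategy strategy = new GreedyStrategy();
        return strategy.breakLines(text, finderFor(language).find(text), maxWidth);
    }

    public static void main(String[] args) {
        System.out.println(layout("http://a.org/bb/cc", "x-url", 13));
    }
}
```

With language "x-url" and a width of 13 characters, the URL above comes out as the two lines "http://a.org/" and "bb/cc"; with any other language it stays on one overlong line, because the stand-in finder then only breaks after spaces. The point is only that the finder (item 1) and the strategy (item 3) can vary independently, so a real UTR14 finder or a TeX-like strategy could be dropped in without touching the other part.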