On Mar 18, 2008, at 18:39, Jeff Hooker wrote:

Hi

I'm dealing with an issue that appears fairly frequently on the FOP and DocbookXSL mailing lists, but my version of it appears to be a bit of a
hybrid.

I've got OFFO hyphenation up and running and it works just fine in
easily 90% of my cases, but I've a special problem with table entry
values that are basically impossible to hyphenate (e.g.
XREF2TTMQ8_QRS20) and are contained in columns that require them to
break at least once. They overflow the cell and overwrite the contents
of the next cell. I'm not sure if the answer lies in hyphenation or line
breaking.

The answer would lie in implementing wrap-option="wrap" properly in FOP.
As I recall, it was once added to 0.9x, but got broken at some point (but I could be wrong).

Hyphenation is not really the answer, since --strictly theoretically-- hyphenation is mentioned only in the context of "words". Hyphenating a date, a number or some product code for instance, may be possible, but it doesn't make much sense IMO. Using hyphenation would be more a workaround/hack than a decent solution.

Currently, FOP's hyphenator has problems with anything non-alphabetic (unless you use a customized pattern file). Once the hyphenator encounters anything that is not a letter, it simply gives up, and that portion of text is rendered as-is (no breaks).


I've got a couple ideas for how to approach this, and would appreciate
the input of anyone who had experience with them.

1. I've noticed that values will break if they contain a forward slash (/).
If I could extend that function to include underscores, colons, and
brackets, that would address the vast majority of my corner cases.

That's due to the implementation of Unicode UAX#14 line-breaking, where the slash and backslash are special characters that offer line-break opportunities.

I seem to remember vaguely that Manuel mentioned that someone could alter the tables and recompile FOP to customize the behavior. OTOH, I don't think this would be a very robust way of going about it. Better to rely on the Unicode standard, and see if the exceptions you need can be covered in another fashion.

2. I've noticed that the values that refuse to break invariably contain
numbers; I suspect that this is what's throwing the hyphenation
algorithm for a loop, but I can't find any guidance on adding numbers to
the classes in the hyphenation configuration file and just telling the
system to break the word wherever the heck it wants to.

See above for the explanation.
No idea if it is even possible to add numbers to the hyphenation pattern file, since numbers are also used in the pattern file to specify the preference/desirability of a hyphenation point...

3. Many have suggested that one should use zero-width spaces. I can't
ask my writers to do this on an as-needed basis, since the data is
published in many different scales and formats, but I'd be fine with
adding a zero-width space between every single character of a table
entry block and letting the lines break where they may. Has anyone
written the XSL for this already, or am I going to be the first?


There should be solutions available. I very much doubt you're the first to try this...
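
For what it's worth, a minimal XSLT 1.0 sketch of option 3 might look like the following. It is untested against your stylesheets and rests on a few assumptions: the template name insert-zwsp is made up for illustration, the match pattern assumes un-namespaced DocBook 4 entry elements, and the file is meant to sit in a customization layer that imports the stock DocBook XSL FO stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical helper (name is an assumption, not part of DocBook XSL):
       copies $text one character at a time and emits a zero-width
       space (U+200B) after every character except the last. -->
  <xsl:template name="insert-zwsp">
    <xsl:param name="text"/>
    <xsl:if test="string-length($text) &gt; 0">
      <!-- Emit the first character unchanged -->
      <xsl:value-of select="substring($text, 1, 1)"/>
      <xsl:if test="string-length($text) &gt; 1">
        <!-- Break opportunity, then recurse on the rest of the string -->
        <xsl:text>&#x200B;</xsl:text>
        <xsl:call-template name="insert-zwsp">
          <xsl:with-param name="text" select="substring($text, 2)"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:if>
  </xsl:template>

  <!-- Assumption: table cells are DocBook 4 "entry" elements.
       Every text node inside one is routed through the helper. -->
  <xsl:template match="entry//text()">
    <xsl:call-template name="insert-zwsp">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

The helper recurses once per character, which should be fine for short table cells, but very long text nodes could run into a processor's recursion limit. If you only want breaks after certain characters (underscores, colons, brackets), the same template could test the current character before emitting the zero-width space.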

Cheers

Andreas
