Manuel Mall wrote:
While investigating whether we could use the standard java.text.BreakIterator to determine line break points, I noticed that FOP treats the forward slash as a valid line-breaking character, in addition to space, zero-width space and hyphen. The Java BreakIterator does not recognize the slash as a line-breaking character (nor, FWIW, does MS Word).

What is the background to FOP allowing this? Is it consistent with normal user expectations, or is it specific to typesetting environments / TeX / Knuth?

The BreakIterator class is supposed to implement Unicode Standard
Annex #14 (TR14).
The slash, U+002F SOLIDUS, is assigned the line breaking property
value SY (Symbols Allowing Break After),
which means "prevent a break before, and allow a break after". I suspect
this is a recent change in Unicode that your JDK does not implement yet.
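
A quick way to check what a given JDK actually does with the slash is to
ask the line BreakIterator directly. The following is only a minimal
sketch: the class name LineBreakProbe, its two helper methods and the
sample string are made up for illustration, and the result will depend
on the JDK version and its break rules, so no expected output is given.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical probe class, not part of FOP or the JDK.
public class LineBreakProbe {

    /** Returns every boundary offset the JDK's line BreakIterator reports for the text. */
    static List<Integer> breakOffsets(String text) {
        BreakIterator it = BreakIterator.getLineInstance(Locale.ROOT);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        // first() yields offset 0; next() walks forward until DONE.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            offsets.add(pos);
        }
        return offsets;
    }

    /** True if a boundary is reported directly after the first '/' (and not merely at the text end). */
    static boolean breaksAfterSlash(String text) {
        int slash = text.indexOf('/');
        return slash >= 0
                && slash + 1 < text.length()
                && breakOffsets(text).contains(slash + 1);
    }

    public static void main(String[] args) {
        String sample = "input/output streams"; // arbitrary sample text
        System.out.println("boundaries: " + breakOffsets(sample));
        System.out.println("break after '/': " + breaksAfterSlash(sample));
    }
}
```

If breaksAfterSlash returns false, the JDK in use does not offer the
break after SOLIDUS that the SY class would allow.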
BTW, first breaking the text at whitespace and then applying the
BreakIterator is unwise, because white space is significant for TR14
line breaking. Unfortunately, combining whitespace normalization, line
break detection and word parsing (for hyphenation) in a single pass is
unwieldy if BreakIterator is used; that's why I tried to implement it
differently some time ago.
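
To illustrate the point about whitespace, the sketch below compares the
boundaries found when the iterator sees the whole text with those found
by the split-first scheme. It is an assumption-laden illustration, not
FOP code: the class SplitThenBreak, both method names and the sample
string are hypothetical. TR14 has rules that look across spaces (for
example, no break after an opening parenthesis even when spaces follow
it), so the two sets can differ; whether they do depends on how closely
the JDK in use follows TR14.

```java
import java.text.BreakIterator;
import java.util.Locale;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison class, not part of FOP or the JDK.
public class SplitThenBreak {

    /** Boundaries reported when the iterator sees the whole text, whitespace included. */
    static TreeSet<Integer> wholeText(String text) {
        TreeSet<Integer> out = new TreeSet<>();
        BreakIterator it = BreakIterator.getLineInstance(Locale.ROOT);
        it.setText(text);
        for (int p = it.first(); p != BreakIterator.DONE; p = it.next()) {
            out.add(p);
        }
        return out;
    }

    /** Boundaries from the naive scheme: split at whitespace, then run the iterator on each token alone. */
    static TreeSet<Integer> splitFirst(String text) {
        TreeSet<Integer> out = new TreeSet<>();
        out.add(0);
        out.add(text.length());
        BreakIterator it = BreakIterator.getLineInstance(Locale.ROOT);
        Matcher m = Pattern.compile("\\S+").matcher(text);
        while (m.find()) {
            String token = m.group();
            // Naive assumption: every whitespace gap is a break opportunity.
            out.add(m.start());
            it.setText(token);
            for (int p = it.first(); p != BreakIterator.DONE; p = it.next()) {
                // Keep only boundaries strictly inside the token.
                if (p > 0 && p < token.length()) {
                    out.add(m.start() + p);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "see ( note ) and/or the rest"; // arbitrary sample text
        System.out.println("whole text : " + wholeText(sample));
        System.out.println("split first: " + splitFirst(sample));
    }
}
```

Any offset that appears only in the split-first set is a break the
whitespace-aware rules would have suppressed.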

