Andreas L. Delmelle created FOP-2466:
----------------------------------------
Summary: Improve output for pre-hyphenated text with SHY combined
with hyphenation properties
Key: FOP-2466
URL: https://issues.apache.org/jira/browse/FOP-2466
Project: Fop
Issue Type: Improvement
Components: layout/line
Affects Versions: 1.1
Reporter: Andreas L. Delmelle
Priority: Minor
When processing an FO file that contains pre-hyphenated text, using
soft hyphens (SHY), FOP's hyphenation does not yield usable results.
From the corresponding thread on fop-users@:
... internally for FOP, [t]he accumulated sequence of characters since the
previous break opportunity is taken to be a 'word', which may or may not end in
a hyphen. If it does, a specific sequence of elements is glued to the
word-box, to prevent a break before SHY and make sure that it is properly
rendered, i.e. the hyphen only counts if the break occurs right after it.
As hyphenation by FOP itself is applied at a higher level, when all layout
elements for a whole paragraph have been collected, that SHY sequence is seen
as a word boundary. That is, that part of the algorithm just accumulates the
text for 'uninterrupted' sequences of word-boxes, and feeds those pieces to the
hyphenator. The real intention is to apply hyphenation across any nested
fo:inlines. 'Uninterrupted' means that auxiliary elements, generated for border
or padding, are explicitly *not* considered as word boundaries. The sequence
generated for SHY contains two non-auxiliary elements, as if it were a space,
perhaps just to ensure that that position in the layout always leads to a
character that is visibly rendered.
In case of pre-hyphenated text, this has the unintended effect of restricting
the input for the hyphenator to parts of words, which is basically meaningless
(and wasteful).
Among other things, this leads to the "hyphenation-ladder-count" property
seemingly having no effect.
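
The fragmentation described above can be illustrated with a minimal sketch.
It assumes only that SHY (U+00AD) ends the accumulated 'word'; the class and
method names are hypothetical and do not correspond to FOP's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only, not FOP's API.
public class ShyFragments {
    static final char SHY = '\u00AD';

    // Mimics the described behaviour: every SHY closes the current 'word',
    // so the hyphenator would only ever be handed the pieces in between.
    static List<String> splitAtShy(String text) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == SHY) {
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }
}
```

For a pre-hyphenated word with two soft hyphens, the hyphenator would receive
three separate fragments instead of one word, none of which is a meaningful
input for pattern-based hyphenation.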
Note - At this point, I believe the behaviour is not necessarily incorrect. I
am also thinking that it would be correct to ignore hyphenation-ladder-count
when hyphenate="false".
Initial idea for a fix:
Make sure that the SHY sequence is not treated as a word boundary in LineLM
when accumulating text for boxes generated by the TextLMs. Once done, we should
then be able to check for each hyphenation point that FOP itself calculates,
whether there is already an explicit SHY present at that same point. In that
case, we can just do nothing (= leave the SHY in place).
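
A rough sketch of that idea, under stated assumptions: the text is
accumulated with SHY stripped but its offsets remembered, and the computed
hyphenation points (here simply passed in as a list, standing in for the
hyphenator's output) are filtered against those offsets. All names are
hypothetical; this is not how LineLM or the TextLMs are actually structured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only, not FOP's API.
public class ShyAwareHyphenation {
    static final char SHY = '\u00AD';

    // Builds the whole word without SHY characters and records, as offsets
    // into that word, every position where an explicit SHY was present.
    static String stripShy(String text, Set<Integer> explicitPoints) {
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == SHY) {
                explicitPoints.add(word.length());
            } else {
                word.append(c);
            }
        }
        return word.toString();
    }

    // Keeps only the computed points that do not coincide with an explicit
    // SHY; for the coinciding ones nothing is done, the SHY stays in place.
    static List<Integer> pointsToInsert(List<Integer> computed,
                                        Set<Integer> explicitPoints) {
        List<Integer> result = new ArrayList<>();
        for (int p : computed) {
            if (!explicitPoints.contains(p)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> explicit = new TreeSet<>();
        String word = stripShy("hy" + SHY + "phenation", explicit);
        // The points 2 and 6 are assumed hyphenator output for this example.
        List<Integer> computed = new ArrayList<>();
        computed.add(2);
        computed.add(6);
        System.out.println(word + " " + pointsToInsert(computed, explicit));
    }
}
```

With this approach the hyphenator sees the complete word, and only the
hyphenation points that are not already marked by an explicit SHY need new
elements.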