Andreas L. Delmelle created FOP-2466:

             Summary: Improve output for pre-hyphenated text with SHY combined with hyphenation properties
                 Key: FOP-2466
             Project: Fop
          Issue Type: Improvement
          Components: layout/line
    Affects Versions: 1.1
            Reporter: Andreas L. Delmelle
            Priority: Minor

When processing an FO file that contains pre-hyphenated text using soft 
hyphens (SHY, U+00AD), FOP's own hyphenation does not yield usable results.

From the corresponding thread on fop-users@:

... internally for FOP, [t]he accumulated sequence of characters since the 
previous break opportunity is taken to be a 'word', which may or may not end in 
a hyphen. If it does, a specific sequence of elements is glued to the word-box, 
to prevent a break before the SHY and to make sure that it is rendered 
properly, i.e. the hyphen only counts if the break occurs right after it.

As hyphenation by FOP itself is applied at a higher level, when all layout 
elements for a whole paragraph have been collected, that SHY sequence is seen 
as a word boundary. That is, that part of the algorithm just accumulates the 
text for ‘uninterrupted’ sequences of word-boxes, and feeds those pieces to the 
hyphenator. The real intention is to apply hyphenation across any nested 
fo:inlines. ‘Uninterrupted’ means that auxiliary elements, generated for border 
or padding, are explicitly *not* considered as word boundaries. The sequence 
generated for SHY contains two non-auxiliary elements, as if it were a space, 
perhaps just to ensure that that position in the layout always leads to a 
character that is visibly rendered.
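
The accumulation described above can be sketched roughly as follows. This is 
only an illustration under simplifying assumptions: the Element class, its 
factory methods and collectWords() are hypothetical names invented for this 
sketch, not FOP's actual layout classes, and the SHY sequence is modelled 
simply as two non-auxiliary, non-box elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordAccumulationSketch {

    /** Hypothetical, much simplified layout element; NOT FOP's real classes. */
    static final class Element {
        final String text;          // only meaningful for word-boxes
        final boolean isBox;        // true for a word-box carrying text
        final boolean isAuxiliary;  // true for border/padding helper elements

        private Element(String text, boolean isBox, boolean isAuxiliary) {
            this.text = text;
            this.isBox = isBox;
            this.isAuxiliary = isAuxiliary;
        }

        static Element box(String text) { return new Element(text, true, false); }
        static Element auxiliary() { return new Element("", false, true); }
        static Element nonAuxiliary() { return new Element("", false, false); }
    }

    /** Collects the text of 'uninterrupted' runs of word-boxes. */
    static List<String> collectWords(List<Element> elements) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (Element e : elements) {
            if (e.isBox) {
                current.append(e.text);
            } else if (!e.isAuxiliary) {
                // A non-auxiliary, non-box element ends the current 'word'.
                if (current.length() > 0) {
                    words.add(current.toString());
                    current.setLength(0);
                }
            }
            // Auxiliary elements (border/padding) are skipped: no boundary.
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // "hy", then the assumed two-element SHY sequence, then "phenation".
        List<Element> preHyphenated = Arrays.asList(
                Element.box("hy"),
                Element.nonAuxiliary(),
                Element.nonAuxiliary(),
                Element.box("phenation"));
        System.out.println(collectWords(preHyphenated));
    }
}
```

With this model, a word split over a nested fo:inline (box, auxiliary, box) is 
still collected as one piece, whereas the pre-hyphenated input in main() is 
handed on as the two fragments "hy" and "phenation".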

In case of pre-hyphenated text, this has the unintended effect of restricting 
the input for the hyphenator to parts of words, which is basically meaningless 
(and wasteful).

Among other things, this leads to the "hyphenation-ladder-count" property 
seemingly having no effect.

Note - At this point, I believe the behaviour is not necessarily incorrect. I 
am also thinking that it would be correct to ignore hyphenation-ladder-count 
when hyphenate="false".

Initial idea for a fix: 
Make sure that the SHY sequence is not treated as a word boundary in LineLM 
when accumulating text for boxes generated by the TextLMs. Once done, we should 
then be able to check for each hyphenation point that FOP itself calculates, 
whether there is already an explicit SHY present at that same point. In that 
case, we can just do nothing (= leave the SHY in place).
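
A minimal sketch of that second step, assuming the text has already been 
accumulated across the SHY sequences. All names here (ShyMergeSketch, 
explicitShyOffsets, pointsNeedingNewBreak) are hypothetical, and the computed 
hyphenation points are passed in as plain offsets rather than taken from FOP's 
hyphenator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ShyMergeSketch {

    static final char SHY = '\u00AD';

    /** Offsets, counted in the SHY-free word, at which an explicit SHY sits. */
    static Set<Integer> explicitShyOffsets(String text) {
        Set<Integer> offsets = new TreeSet<>();
        int plainIndex = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == SHY) {
                offsets.add(plainIndex);
            } else {
                plainIndex++;
            }
        }
        return offsets;
    }

    /** The word as it would be fed to the hyphenator, without SHY characters. */
    static String stripShy(String text) {
        return text.replace(String.valueOf(SHY), "");
    }

    /**
     * Of the computed hyphenation points, returns only those that are not
     * already covered by an explicit SHY; the others are left untouched.
     */
    static List<Integer> pointsNeedingNewBreak(String text, List<Integer> computed) {
        Set<Integer> explicit = explicitShyOffsets(text);
        List<Integer> result = new ArrayList<>();
        for (int point : computed) {
            if (!explicit.contains(point)) {
                result.add(point);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "hy\u00ADphenation";
        // Assumed hyphenator output for "hyphenation": hy-phen-ation.
        List<Integer> computed = Arrays.asList(2, 6);
        System.out.println(stripShy(text));
        System.out.println(pointsNeedingNewBreak(text, computed));
    }
}
```

In this example the point at offset 2 coincides with the explicit SHY and is 
left alone, so only offset 6 would need a newly generated break.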

This message was sent by Atlassian JIRA
