Priority: P2
            Bug ID: 54081
           Summary: Properly tag hyphenated words
          Severity: normal
    Classification: Unclassified
                OS: All
          Hardware: All
            Status: NEW
           Version: all
         Component: pdf
           Product: Fop

If a hyphenated word is stored as-is in the PDF output, a screen reader will
read it differently from the same word when it is not hyphenated. This can
result in incomprehensible text.

To fix that problem, a hyphenated word should be tagged as such. This can be
done in two ways.

The first possibility is to add an 'ActualText' entry to the property list of
the corresponding marked-content sequence. Its value would basically be the
whole text minus the last hyphen character.
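
A minimal sketch of how the 'ActualText' value could be derived and written
into the marked-content operator. The class and method names are hypothetical
and are not FOP APIs. It assumes the text is plain ASCII, so that a PDF literal
string is sufficient; other text would need a different string encoding.

```java
final class ActualTextSketch {

    /** Returns the text without its trailing hyphenation character, if any. */
    static String actualTextFor(String renderedText, char hyphenationChar) {
        int last = renderedText.length() - 1;
        if (last >= 0 && renderedText.charAt(last) == hyphenationChar) {
            return renderedText.substring(0, last);
        }
        return renderedText;
    }

    /** Escapes backslashes and parentheses for use in a PDF literal string. */
    static String escapePdfLiteral(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '\\' || c == '(' || c == ')') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds the operator that opens a marked-content sequence with ActualText. */
    static String beginMarkedContent(String renderedText, char hyphenationChar) {
        String actual = actualTextFor(renderedText, hyphenationChar);
        return "/Span <</ActualText (" + escapePdfLiteral(actual) + ")>> BDC";
    }
}
```

The sequence opened this way would still show the hyphenated text and be
closed with EMC; only the replacement text seen by assistive tools changes.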

The second possibility is to replace the last hyphen with a soft hyphen
character (U+00AD), which screen readers recognize, so that the split word is
read as one. This works only if the font has a glyph for the soft hyphen
character.
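
A minimal sketch of the replacement, assuming the soft hyphen is U+00AD. The
GlyphLookup interface is a hypothetical stand-in for whatever glyph coverage
check the font classes provide; it is not an FOP API.

```java
final class SoftHyphenSketch {

    static final char SOFT_HYPHEN = '\u00AD';

    /** Hypothetical stand-in for a font's glyph coverage check. */
    interface GlyphLookup {
        boolean hasGlyph(char c);
    }

    /**
     * Replaces a trailing hyphenation character with a soft hyphen.
     * The text is returned unchanged when it does not end with the
     * hyphenation character or when the font has no soft hyphen glyph.
     */
    static String withSoftHyphen(String lineText, char hyphenationChar, GlyphLookup font) {
        int last = lineText.length() - 1;
        boolean endsWithHyphen = last >= 0 && lineText.charAt(last) == hyphenationChar;
        if (!endsWithHyphen || !font.hasGlyph(SOFT_HYPHEN)) {
            return lineText;
        }
        return lineText.substring(0, last) + SOFT_HYPHEN;
    }
}
```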

The latter possibility is the recommended way to handle hyphenated words. The
former can be implemented as a fallback for when there is no available glyph
for the soft hyphen, or when the hyphenation character is not actually a hyphen
(this can be customized through the hyphenation-character property).
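
The choice between the two approaches could be sketched as follows. All names
are hypothetical, and treating only U+002D and U+2010 as "actually a hyphen" is
an assumption made for this illustration.

```java
final class HyphenTaggingChoice {

    enum Strategy { SOFT_HYPHEN, ACTUAL_TEXT }

    /** Assumption: only hyphen-minus and U+2010 HYPHEN count as real hyphens. */
    static boolean isHyphen(char c) {
        return c == '-' || c == '\u2010';
    }

    /**
     * Prefers the soft hyphen; falls back to ActualText when the font has no
     * soft hyphen glyph or the hyphenation character is not a hyphen.
     */
    static Strategy choose(char hyphenationChar, boolean fontHasSoftHyphenGlyph) {
        if (isHyphen(hyphenationChar) && fontHasSoftHyphenGlyph) {
            return Strategy.SOFT_HYPHEN;
        }
        return Strategy.ACTUAL_TEXT;
    }
}
```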
