[ 
https://issues.apache.org/jira/browse/FOP-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17807729#comment-17807729
 ] 

Björn Kautler edited comment on FOP-2522 at 1/17/24 1:49 PM:
-------------------------------------------------------------

Probably because that is exactly what a soft hyphen is for?

[https://en.wikipedia.org/wiki/Soft_hyphen]
{quote}In computing and typesetting, a *soft hyphen* [...] or *syllable hyphen* 
[...], abbreviated {*}SHY{*}, is a code point reserved in some [coded character 
sets|https://en.wikipedia.org/wiki/Coded_character_set] for the purpose of 
breaking words across lines by inserting visible 
[hyphens|https://en.wikipedia.org/wiki/Hyphen] if they fall on the line end but 
remain invisible within the line
{quote}
So it should never be shown within a line, but when needed at that spot the 
text can be split at the line-end and the hyphenation character shown.
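
To make that rule concrete, here is a minimal sketch in Java. It is not FOP's actual layout code: the class {{SoftHyphenSketch}}, the method {{layout}}, and measuring width in characters are all assumptions made for illustration. It breaks only at SHY positions, drops every SHY that ends up inside a line, and renders the configured hyphenation character only where a break actually happens.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not how FOP implements line breaking.
public class SoftHyphenSketch {
    static final char SHY = '\u00AD';

    // Greedy line filling, with width measured in characters (an assumption).
    // A SHY inside a line is never rendered; a SHY at a break is rendered
    // as hyphenationChar. A segment longer than maxWidth simply overflows.
    static List<String> layout(String text, int maxWidth, char hyphenationChar) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String segment : text.split(String.valueOf(SHY))) {
            // The +1 conservatively reserves room for a hyphenation character
            // in case the line has to be broken after this segment.
            if (line.length() > 0 && line.length() + segment.length() + 1 > maxWidth) {
                lines.add(line.toString() + hyphenationChar);
                line.setLength(0);
            }
            line.append(segment);
        }
        lines.add(line.toString());
        return lines;
    }

    public static void main(String[] args) {
        String text = "asdf" + SHY + "asdf" + SHY + "asdf" + SHY + "asdf";
        System.out.println(layout(text, 10, '-')); // prints [asdfasdf-, asdfasdf]
        System.out.println(layout(text, 10, '/')); // prints [asdfasdf/, asdfasdf]
    }
}
```

In both calls no SHY and no hyphen-minus appears within a line; only the character at the break changes with the configured hyphenation character.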

With a ZWSP it would probably break, but not show the hyphenation character.

Also, with a ZWSP other tools would consider the left and right parts to be two 
words, while with "SHY" it really is one word with markers showing where the 
word may be split at the line-end.

If you extend the test a bit, for example with a very long line of "asdf" 
followed by "SHY", repeated without any other spaces, so 
{{asdf<SHY>asdf<SHY>asdf...}}, and then duplicate the {{fo:block}} and set the 
{{hyphenation-character}} attribute of the second one to {{/}}, mainly for 
demonstration purposes, this is the intended outcome, which is also what my 
patch produces:

!image-2024-01-17-14-48-50-945.png!

while without my patch the actual and undesired outcome is:

!image-2024-01-17-14-49-08-309.png!

So yes, I think it is right to always suppress it like I did, as it should 
never be correct to show a hyphen-minus in place of a SHY, which is only a "you 
can break the word here" marker.


> [PATCH] Soft hyphens in front of some characters are transformed to 
> hyphen-minus
> --------------------------------------------------------------------------------
>
>                 Key: FOP-2522
>                 URL: https://issues.apache.org/jira/browse/FOP-2522
>             Project: FOP
>          Issue Type: Bug
>    Affects Versions: 2.0
>            Reporter: Björn Kautler
>            Priority: Major
>         Attachments: image-2024-01-17-14-48-50-945.png, 
> image-2024-01-17-14-49-08-309.png, issue-2522.fo, issue-2522.patch
>
>
> If you have a verbatim block like {{<programlisting>}} in DocBook, the 
> DocBook XSL stylesheets insert many soft hyphens 
> (http://decodeunicode.org/u+00AD) into the content to show where the 
> FO-processor may insert linebreaks. By default after spaces and non-breakable 
> spaces, but configurable also after arbitrary other characters.
> Unfortunately it seems FOP does not handle the soft hyphens correctly, 
> depending on the character that follows them. Soft hyphens in front of some 
> characters are transformed to hyphen-minus, no matter what 
> hyphenation-character is configured and even if the occurrence is within a 
> line and not at a line break.
> I've observed this behaviour with soft hyphens in front of apostrophe 
> (http://decodeunicode.org/u+0027), quotation mark 
> (http://decodeunicode.org/u+0022), hyphen-minus 
> (http://decodeunicode.org/u+002D) and full stop 
> (http://decodeunicode.org/u+002E)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
