[ 
https://issues.apache.org/jira/browse/FOP-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544437#comment-17544437
 ] 

Martin Hönings commented on FOP-2701:
-------------------------------------

I enabled complex scripts; disabled="false" means enabled.

If I disable complex scripts, i.e. disabled="true", I will not get ligatures in 
the PDF. Please see the last paragraph of my comment from 10/May/22.
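
For reference, a minimal sketch of the setting in question, assuming the 
usual top-level complex-scripts element in fop.xconf (the rest of the 
configuration is omitted, and the attached fop.xconf may differ):

{code:xml}
<fop version="1.0">
    <!-- disabled="false" leaves complex script support enabled. -->
    <!-- With disabled="true", no ligatures appear in the PDF, as described above. -->
    <complex-scripts disabled="false"/>
</fop>
{code}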

 

> Some of the latin ligatures make text not searchable in PDF
> -----------------------------------------------------------
>
>                 Key: FOP-2701
>                 URL: https://issues.apache.org/jira/browse/FOP-2701
>             Project: FOP
>          Issue Type: Bug
>          Components: font/opentype
>    Affects Versions: 2.1
>         Environment: Windows 10, Calibri font.
>            Reporter: Dan Caprioara
>            Priority: Major
>         Attachments: fop.xconf, latn-ligatures-Antenna-House.pdf, 
> latn-ligatures-FOP.pdf, out.pdf, test.fo
>
>
> This problem occurs with the Calibri font, which ships with the MS Office 
> suite and Windows 10.
> I tested with the following text: {{file settings}}. 
> The resulting PDF text contains ligatures: {{(fi)le se(tti)ngs}}.
> Searching for {{file}} in Acrobat Reader selects the first word, which is 
> correct. But searching for {{set}} or {{settings}} gives no results. 
> The same example run with Antenna House works fine: searching for 
> {{settings}} returns results.
> Here is the complete FO file:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
>     <fo:layout-master-set>
>         <fo:simple-page-master master-name="a">
>             <fo:region-body/>
>         </fo:simple-page-master>
>     </fo:layout-master-set>
>     <fo:page-sequence master-reference="a">
>         <fo:flow flow-name="xsl-region-body">
>             <fo:block font-family="Calibri" font-size="40pt">file settings</fo:block>
>         </fo:flow>
>     </fo:page-sequence>
> </fo:root>
> {code}
> Some considerations:
> # A workaround would be to reject all substitutions that are not part of 
> org.apache.fop.fonts.type1.AdobeStandardEncoding. This would keep the (fi) 
> ligature but reject the (tti) one. However, this seems to work only for 
> Calibri and not for Roboto.
> # I think there might be an issue with the font embedding, where some 
> substitution mapping data is lost. This is just a guess; I am not sure how 
> PDF deals with substitutions.
> I know that setting xml:lang to "en" in the FO disables the ligatures, but 
> that is not a solution for my project. I would appreciate any suggestions.
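> For illustration, a sketch of that xml:lang setting, assuming it is placed 
> on the fo:block from the test file above (a fragment only; the placement is 
> an assumption, not necessarily how it was set in my tests):
> {code:xml}
> <!-- xml:lang="en" reportedly disables the ligature substitutions -->
> <fo:block font-family="Calibri" font-size="40pt" xml:lang="en">file settings</fo:block>
> {code}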



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
