[ 
https://issues.apache.org/jira/browse/PDFBOX-2842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621936#comment-14621936
 ] 

John Hewson commented on PDFBOX-2842:
-------------------------------------

This uses Panose and other font metadata to perform smart substitutions of CJK 
fonts. Each CJK font on the system is scored and ranked against the information 
in the FontDescriptor and the best font is chosen. This really helps with 
PDFBOX-2509.
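
To make the scoring idea concrete, here is a minimal sketch in plain Java. It 
is not the committed implementation: FontTraits, CjkFontScorer and the weights 
(100/20/10/5) are hypothetical names and numbers chosen for illustration, and 
it only assumes that a character collection (e.g. "Adobe-Japan1"), a Panose 
serif style, a Panose weight and an italic flag are available for both the 
requested font and each system font.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for the traits being compared; not a PDFBox class. */
final class FontTraits {
    final String name;
    final String ordering;     // character collection, e.g. "Adobe-Japan1"; "" if none
    final int panoseSerifStyle; // Panose serif style digit
    final int panoseWeight;     // Panose weight digit
    final boolean italic;

    FontTraits(String name, String ordering, int panoseSerifStyle,
               int panoseWeight, boolean italic) {
        this.name = name;
        this.ordering = ordering;
        this.panoseSerifStyle = panoseSerifStyle;
        this.panoseWeight = panoseWeight;
        this.italic = italic;
    }
}

/** Illustrative scorer: higher score means a closer match. Weights are arbitrary. */
final class CjkFontScorer {
    static int score(FontTraits wanted, FontTraits candidate) {
        int score = 0;
        // Covering the right character collection matters most for CJK text.
        if (!wanted.ordering.isEmpty() && wanted.ordering.equals(candidate.ordering)) {
            score += 100;
        }
        // Same serif style (e.g. serif vs. sans) is the next strongest hint.
        if (wanted.panoseSerifStyle == candidate.panoseSerifStyle) {
            score += 20;
        }
        // Closer weights score higher; never go below zero for this term.
        int weightDiff = Math.abs(wanted.panoseWeight - candidate.panoseWeight);
        score += Math.max(0, 10 - 2 * weightDiff);
        if (wanted.italic == candidate.italic) {
            score += 5;
        }
        return score;
    }

    /** Returns the highest-scoring candidate, or null if the list is empty. */
    static FontTraits best(FontTraits wanted, List<FontTraits> candidates) {
        FontTraits best = null;
        int bestScore = Integer.MIN_VALUE;
        for (FontTraits candidate : candidates) {
            int s = score(wanted, candidate);
            if (s > bestScore) {
                bestScore = s;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        FontTraits wanted = new FontTraits("wanted", "Adobe-Japan1", 11, 5, false);
        List<FontTraits> system = Arrays.asList(
                new FontTraits("LatinSans", "", 11, 5, false),
                new FontTraits("JapaneseSerif", "Adobe-Japan1", 2, 5, false),
                new FontTraits("JapaneseSans", "Adobe-Japan1", 11, 6, false));
        System.out.println(best(wanted, system).name);
    }
}
```

With the made-up fonts in main, the Japanese sans candidate wins: the collection 
match outweighs the slightly different weight, while the Latin font loses 
despite matching style and weight exactly.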

If this approach proves to be robust, we could expand it in the future into a 
generic font substitution mechanism for Latin fonts too.

> Overhaul font substitution
> --------------------------
>
>                 Key: PDFBOX-2842
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2842
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: FontBox, PDModel
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>            Assignee: John Hewson
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: 029423-p1.pdf, 166292-fi-ligature.pdf
>
>
> The improved font substitution mechanisms in 2.0 are not quite sufficient to 
> handle all PDFs. Specifically, CJK substitution and substitution of TTF in 
> place of CFF fonts are not possible with the current design.
> The CJK problems can be seen in PDFBOX-2509 and PDFBOX-2563, which does not 
> solve the problem. Additional font API weaknesses can be found in PDFBOX-2578 
> and PDFBOX-2366. This meta-issue aims to address all of those sub-issues.
> The current problems are:
> - FontBox does not provide a generic font type, so we have to handle 
> TrueTypeFont, CFFFont, and Type1Font separately. This hinders cross-format 
> substitution.
> - ExternalFonts has no knowledge of the CIDSystemInfo, which is necessary for 
> CJK substitution.
> - FontProvider contains too much public logic which should be internal to 
> PDFBox, e.g. substitution logic. This makes it brittle and means we won't be 
> able to add additional logic after 2.0 is released, e.g. CJK substitution.
> - Too much confusion about the role of ExternalFonts, particularly with 
> regard to mapping of built-in fonts and the definition of substitute vs. 
> fallback font.
> - ExternalFonts is a black box: the user cannot tell whether the font 
> returned is an exact match, or a last-resort fallback.
> - The font substitution API is confusing; users preferred having a flat file 
> format.
> - PDSimpleFont#getEncoding() can return null for TTFs which use built-in 
> encodings. This has caused a lot of bugs - there must be a better way.
> - We still have some confusing names, for example a CustomEncoding is known 
> as a "built-in encoding" in the spec.
> - There is no fallback CFF font; we resort to AdobeBlank instead, which 
> renders nothing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
