[ 
https://issues.apache.org/jira/browse/PDFBOX-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926791#comment-16926791
 ] 

Tilman Hausherr edited comment on PDFBOX-4647 at 9/10/19 5:20 PM:
------------------------------------------------------------------

The Chinese title translates to "Inline font parsing does not come out".

The "OpenType Layout tables used in font" log message is not relevant here.

You're missing the text because the "ToUnicode" mapping is missing in 
that font. Try with Adobe Reader: you will not be able to extract it there 
either. (It is the part with "Boulevard Miguel de Cervantes".) The only 
solution is OCR, e.g. with Apache Tika and Tesseract.
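
To illustrate why a missing "ToUnicode" mapping blocks extraction, here is a 
minimal toy sketch. It is not PDFBox code: the function name, the character 
codes and the sample maps are all hypothetical, and real fonts have further 
fallbacks (encodings, glyph names) that this ignores. It only shows the basic 
idea that each character code on the page must be looked up in a mapping to 
get a Unicode character, and that codes without an entry yield no text.

```python
# Toy model of code-to-Unicode lookup during text extraction.
# NOT PDFBox code; names and sample data are hypothetical.

def extract_text(codes, to_unicode):
    """Map each character code to Unicode; collect codes with no mapping."""
    text = []
    unmapped = []
    for code in codes:
        char = to_unicode.get(code)
        if char is None:
            # Comparable to a "No Unicode mapping for CID+24" warning.
            unmapped.append(code)
        else:
            text.append(char)
    return "".join(text), unmapped


# A font whose mapping covers every code used on the page.
complete_map = {36: "B", 82: "o", 88: "u"}
# A font with no mapping at all.
missing_map = {}

codes = [36, 82, 88]
print(extract_text(codes, complete_map))  # ('Bou', [])
print(extract_text(codes, missing_map))   # ('', [36, 82, 88])
```

With the empty map the glyphs can still be drawn, since rendering only needs 
the glyph outlines, but no text comes out, which is why OCR on the rendered 
page is the remaining option.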

See also

[https://pdfbox.apache.org/2.0/faq.html#text-extraction]

 



> PDF embedded font cannot be parsed: ABCDEE+Arial font
> -----------------------------------------------------
>
>                 Key: PDFBOX-4647
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4647
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox, PDModel
>    Affects Versions: 2.0.4
>            Reporter: wanling
>            Priority: Major
>         Attachments: 5e214f828f164322a6600f183191dda5.pdf
>
>
> The errors are as follows:
> OpenType Layout tables used in font ABCDEE+Arial are not implemented in 
> PDFBox and will be ignored;
> No Unicode mapping for CID+24 (24) in font ABCDEE+Arial
> Adobe can display it normally.
>  


