[ 
https://issues.apache.org/jira/browse/PDFBOX-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709592#comment-13709592
 ] 

Raymond Wu commented on PDFBOX-1661:
------------------------------------

I'm sorry I didn't mention text extraction.
Adobe Reader extracts the text correctly, and PDFBox does too once this 
patch is applied.
The two attachments may be helpful.
Thanks for your reply. ^^
                
> Fix font subtype automatically
> ------------------------------
>
>                 Key: PDFBOX-1661
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1661
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: PDModel
>    Affects Versions: 1.8.1
>         Environment: PDFBox: PDFBox 1.8.1
> Reader: Adobe Reader 11.0.0
> Generator:  TCPDF 4.5.041
> PDF Content:
> <</Type /Font
> /BaseFont /AdobeSongStd-Light,Bold-UniGB-UTF16-H
> /Subtype /Type0
> /Encoding /UniGB-UTF16-H
> /DescendantFonts [27 0 R]
>            Reporter: Raymond Wu
>              Labels: encoding, font
>         Attachments: adobe-screenshot.png, pdf-screenshot.png
>
>
> PDFBox parses the subtype as "Type0", but Adobe Reader treats it as "Type1".
> This is not a bug in PDFBox.
> The cause is that TCPDF 4.5.041 generates the font AdobeSongStd-Light with 
> the wrong subtype, "Type0".
> It should be "Type1".
> I have tested the following code and it works.
> File: org/apache/pdfbox/pdmodel/font/PDFontFactory.java
> Method: public static PDFont createFont( COSDictionary dic ) throws IOException
> Original:
> else if( subType.equals( COSName.TYPE0 ) )
> {
>     retval = new PDType0Font( dic );
> }
> Fixed:
> else if( subType.equals( COSName.TYPE0 ) )
> {
>     COSName encoding = (COSName)dic.getDictionaryObject(COSName.ENCODING);
>     retval = (encoding!=null) ? new PDType1Font( dic ) : new PDType0Font( dic );
> }
> With this patch, PDFBox behaves like Adobe Reader.
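
The branch proposed in the description can be tried outside PDFBox with a small stand-alone sketch. Everything here is an assumption made for illustration: a plain Map stands in for COSDictionary, and FontSubtypeSketch, FontKind and chooseForType0 are hypothetical names, not PDFBox API. It also differs from the patch in one respect: it uses an instanceof check instead of a cast, so an /Encoding entry that is not a name falls through to Type0 rather than throwing.

```java
import java.util.HashMap;
import java.util.Map;

public class FontSubtypeSketch {

    /** Hypothetical stand-in for the font classes the factory would create. */
    public enum FontKind { TYPE0, TYPE1 }

    /**
     * Mirrors the proposed branch for a dictionary whose /Subtype is /Type0:
     * if an /Encoding name is present, treat the font as Type1,
     * otherwise keep it as Type0.
     * The Map is an assumption standing in for COSDictionary.
     */
    public static FontKind chooseForType0(Map<String, Object> fontDict) {
        Object encoding = fontDict.get("Encoding");
        // instanceof (rather than a cast) avoids a ClassCastException
        // when /Encoding is something other than a name.
        return (encoding instanceof String) ? FontKind.TYPE1 : FontKind.TYPE0;
    }

    public static void main(String[] args) {
        // Entries taken from the font dictionary quoted in the report.
        Map<String, Object> dict = new HashMap<>();
        dict.put("Subtype", "Type0");
        dict.put("Encoding", "UniGB-UTF16-H");
        System.out.println(chooseForType0(dict)); // prints TYPE1
    }
}
```

Note that this rule sends every /Type0 dictionary that names an encoding down the Type1 path, so it is worth checking against well-formed composite fonts before relying on it.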

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
