[ 
https://issues.apache.org/jira/browse/PDFBOX-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14958701#comment-14958701
 ] 

Guillaume Monteils commented on PDFBOX-3024:
--------------------------------------------

During my investigation, I did two more things that I did not tell you about.

I tested the PDF with the Preflight plugin in Acrobat Pro X, and the document 
was valid as PDF/A.

I also tried running the code with the 2.0-SNAPSHOT lib, and the program 
returns the following error:
3.3.1 : Glyph error, The character code 0 in the font program 
"INWHIX+TimesNewRomanPSMT" is missing from the Character Encoding

What is strange in this new version is that when I check for character code 0 
(and my font has an identity CID-to-GID map), the hasGlyph test can never 
return true.

{code}
public class CIDType2Container extends FontContainer<PDCIDFontType2>
{
    public CIDType2Container(PDCIDFontType2 font)
    {
        super(font);
    }

    @Override
    public boolean hasGlyph(int code) throws IOException
    {
        return font.codeToGID(code) != 0;
    }
}
{code}

If I change PDCIDFontType2.codeToGID to return -1 when a problem occurs (code 
out of range, no value in the CID-to-GID map) and change 
CIDType2Container.hasGlyph to test font.codeToGID(code) >= 0, then the 
document is valid.
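
To illustrate the idea, here is a minimal standalone sketch. It is not the 
actual PDFBox code: the class GlyphLookupSketch, its method names, and the 
int[] standing in for the CID-to-GID map are all hypothetical, chosen only to 
show why "!= 0" rejects code 0 and how a -1 sentinel avoids that.

```java
public class GlyphLookupSketch
{
    // Hypothetical sentinel meaning "no glyph could be resolved".
    static final int MISSING = -1;

    // Proposed behaviour: return -1 when the code is out of range,
    // instead of falling back to GID 0.
    static int codeToGID(int[] cidToGid, int code)
    {
        if (code < 0 || code >= cidToGid.length)
        {
            return MISSING;
        }
        return cidToGid[code];
    }

    // Mimics the current test: failures collapse to 0 and the check is "!= 0",
    // so a code that legitimately maps to GID 0 is reported as missing.
    static boolean hasGlyphCurrent(int[] cidToGid, int code)
    {
        return Math.max(codeToGID(cidToGid, code), 0) != 0;
    }

    // Proposed test: any non-negative GID counts as present.
    static boolean hasGlyphProposed(int[] cidToGid, int code)
    {
        return codeToGID(cidToGid, code) >= 0;
    }

    public static void main(String[] args)
    {
        int[] identity = {0, 1, 2}; // identity map: CID n -> GID n
        System.out.println(hasGlyphCurrent(identity, 0));  // false
        System.out.println(hasGlyphProposed(identity, 0)); // true
        System.out.println(hasGlyphProposed(identity, 5)); // false
    }
}
```

With the sentinel, code 0 is accepted under an identity map while a code that 
is really out of range is still rejected.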

Hope it helps.

> Preflight validation call PDType0Font.clear at the wrong time
> -------------------------------------------------------------
>
>                 Key: PDFBOX-3024
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3024
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 1.8.10
>            Reporter: Guillaume Monteils
>         Attachments: eclipse-1.jpg, eclipse-2.jpg
>
>
> I used the algorithm here to test PDF/A compliance:
> https://pdfbox.apache.org/1.8/cookbook/pdfavalidation.html
> With one PDF document (which I can't give you due to confidentiality), a 
> NullPointerException occurs here:
> {code}
> java.lang.NullPointerException
>       at org.apache.pdfbox.pdmodel.font.PDType0Font.getFontWidth(PDType0Font.java:188)
>       at org.apache.pdfbox.preflight.font.container.FontContainer.checkGlyphWith(FontContainer.java:114)
>       at org.apache.pdfbox.preflight.content.ContentStreamWrapper.validText(ContentStreamWrapper.java:372)...
> {code}
> As I dug deeper, I found that Preflight loads a font context where it puts 
> all PDF fonts. The PDType0Font is also created and put in this context.
> {code}
> (CSObject : 
> COSDictionary{(COSName{BaseFont}:COSName{INWHIX+TimesNewRomanPSMT})       
> (COSName{DescendantFonts}:COSArray{[COSObject{349, 0}]}) 
> (COSName{Encoding}:COSName{Identity-H})       
> (COSName{Subtype}:COSName{Type0}) 
> (COSName{ToUnicode}:COSDictionary{(COSName{Filter}:COSName{FlateDecode})      
> (COSName{Length}:COSInt{260}) }) (COSName{Type}:COSName{Font}) })
> {code}
> The problem is that at the end of one step of the analysis, the clear method 
> is called on the PDType0Font (see eclipse-1.jpg), but the font is still 
> present in the context. On a second step, the same font is retrieved from the 
> context, with no data in it, and the NullPointerException occurs (see 
> eclipse-2.jpg).
> I tried the validation after removing the clear method from PDType0Font and 
> it works just fine.
> I think the problem comes from this context, and a clear on a font should 
> also trigger a deletion in this map.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
