[
https://issues.apache.org/jira/browse/PDFBOX-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041627#comment-15041627
]
Guillaume Monteils commented on PDFBOX-3024:
--------------------------------------------
I agree with you: we should not disable the width check for a missing glyph,
but that is what the code does.
Here is the code from Abstract FontContainer:
{code}
public void checkGlyphWidth(int code) throws GlyphException
{
    if (isAlreadyProcessed(code))
    {
        return;
    }
    try
    {
        // check for missing glyphs
        if (!hasGlyph(code))
        {
            GlyphException e = new GlyphException(PreflightConstants.ERROR_FONTS_GLYPH_MISSING, code,
                    "The character code " + code + " in the font program \"" + font.getName()
                    + "\" is missing from the Character Encoding");
            markAsInvalid(code, e);
            throw e;
        }
        // check widths
        float expectedWidth = font.getWidth(code);
        float foundWidth = font.getWidthFromFont(code);
        checkWidthsConsistency(code, expectedWidth, foundWidth);
    }
    catch (IOException e)
    {
        throw new GlyphException(PreflightConstants.ERROR_FONTS_GLYPH, code,
                "Unexpected error during the width validation for the character code " + code
                + " : " + e.getMessage(), e);
    }
}
{code}
The hasGlyph(code) method returns false when the character code points to a
missing glyph. If I disable this test, the validation succeeds on my PDF file.
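
To make the ordering explicit, here is a minimal sketch of the same control flow. It uses hypothetical stand-in classes (GlyphCheckSketch, two plain maps, string error codes and a one-unit tolerance are all assumptions for illustration), not the PDFBox ones. It only shows that a code pointing at a missing glyph is rejected before the widths are ever compared:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the preflight font container; not the PDFBox API.
class GlyphCheckSketch {

    // Simplified exception carrying an illustrative error code (assumption).
    static class GlyphException extends Exception {
        final String errorCode;

        GlyphException(String errorCode, String message) {
            super(message);
            this.errorCode = errorCode;
        }
    }

    // code -> width declared in the PDF font dictionary (stands in for font.getWidth).
    final Map<Integer, Float> declaredWidths = new HashMap<>();
    // code -> width found in the font program; an absent key models a missing glyph.
    final Map<Integer, Float> programWidths = new HashMap<>();

    boolean hasGlyph(int code) {
        return programWidths.containsKey(code);
    }

    void checkGlyphWidth(int code) throws GlyphException {
        // The missing-glyph test runs first and throws, so the width
        // comparison below is never reached for such a code.
        if (!hasGlyph(code)) {
            throw new GlyphException("GLYPH_MISSING",
                    "The character code " + code + " has no glyph in the font program");
        }
        float expected = declaredWidths.getOrDefault(code, 0f);
        float found = programWidths.get(code);
        // Tolerance of one unit is an assumption made for this sketch.
        if (Math.abs(expected - found) > 1f) {
            throw new GlyphException("WIDTH_MISMATCH",
                    "Width " + found + " differs from declared width " + expected);
        }
    }

    // Convenience wrapper: returns the error code, or null when the check passes.
    String errorFor(int code) {
        try {
            checkGlyphWidth(code);
            return null;
        } catch (GlyphException e) {
            return e.errorCode;
        }
    }
}
```

With a font that declares a width for a code but has no glyph for it, errorFor returns "GLYPH_MISSING" and the width comparison never runs, which matches what I see on my file.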
> Preflight validation call PDType0Font.clear at the wrong time
> -------------------------------------------------------------
>
> Key: PDFBOX-3024
> URL: https://issues.apache.org/jira/browse/PDFBOX-3024
> Project: PDFBox
> Issue Type: Bug
> Components: Preflight
> Affects Versions: 1.8.10
> Reporter: Guillaume Monteils
> Attachments: 004973.pdf, PDF-Tools.png, PDFBox.png, eclipse-1.jpg,
> eclipse-2.jpg
>
>
> I used the algorithm here to test PDF/A compliance:
> https://pdfbox.apache.org/1.8/cookbook/pdfavalidation.html
> With one PDF document (which I can't share due to confidentiality), a
> NullPointerException occurs here:
> {code}
> java.lang.NullPointerException
>     at org.apache.pdfbox.pdmodel.font.PDType0Font.getFontWidth(PDType0Font.java:188)
>     at org.apache.pdfbox.preflight.font.container.FontContainer.checkGlyphWith(FontContainer.java:114)
>     at org.apache.pdfbox.preflight.content.ContentStreamWrapper.validText(ContentStreamWrapper.java:372)...
> {code}
> As I dug deeper, I found that preflight loads a font context where it puts
> all PDF fonts. The PDType0Font is also created and put in this context.
> {code}
> (CSObject :
> COSDictionary{(COSName{BaseFont}:COSName{INWHIX+TimesNewRomanPSMT})
> (COSName{DescendantFonts}:COSArray{[COSObject{349, 0}]})
> (COSName{Encoding}:COSName{Identity-H})
> (COSName{Subtype}:COSName{Type0})
> (COSName{ToUnicode}:COSDictionary{(COSName{Filter}:COSName{FlateDecode})
> (COSName{Length}:COSInt{260}) }) (COSName{Type}:COSName{Font}) })
> {code}
> The problem is that at the end of one step of the analysis, the clear method
> is called on the PDType0Font (see eclipse-1.jpg), but the font is still
> present in the context. In a second step, the same font is retrieved from the
> context with no data in it, and the NullPointerException occurs (see
> eclipse-2.jpg).
> I tried the validation after removing the clear method from PDType0Font and
> it worked just fine.
> I think the problem comes from this context: a clear on a font should
> also trigger a deletion from this map.
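
The stale-entry problem described in the report can be sketched in a few lines. This is a simplified model under stated assumptions, not the preflight code: FontCacheSketch, CachedFont, clearOnly and clearAndEvict are hypothetical names, and the font data is reduced to a width array. It shows a cleared font that stays in the map failing on the next lookup, and the suggested alternative of evicting the entry when the font is cleared:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a font context; not the PDFBox preflight classes.
class FontCacheSketch {

    // Stand-in for a font whose clear() drops its internal data.
    static class CachedFont {
        private float[] widths;

        CachedFont(float[] widths) {
            this.widths = widths;
        }

        void clear() {
            widths = null;
        }

        float getWidth(int code) {
            // Throws NullPointerException once clear() has been called.
            return widths[code];
        }
    }

    private final Map<String, CachedFont> context = new HashMap<>();

    void register(String key, CachedFont font) {
        context.put(key, font);
    }

    CachedFont lookup(String key) {
        return context.get(key);
    }

    // Mirrors the reported behaviour: the font is emptied but stays in the map.
    void clearOnly(String key) {
        CachedFont font = context.get(key);
        if (font != null) {
            font.clear();
        }
    }

    // Suggested alternative: clearing a font also removes it from the map,
    // so a later step gets null and can rebuild the font instead of reusing it.
    void clearAndEvict(String key) {
        CachedFont font = context.remove(key);
        if (font != null) {
            font.clear();
        }
    }
}
```

After clearOnly, lookup still returns the emptied font and getWidth throws a NullPointerException. After clearAndEvict, lookup returns null, which the caller can detect and handle.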