Hi,

Sent: Tue, 10 Nov 2009 From: Aaron Kaplan <lists2...@aaronkaplan.info>

> I checked out pdfbox 0.8.0, built it with ant, and ran the tests.  Six 
> of them are failing:
> 
> Failed tests:
>    testExtract(org.apache.pdfbox.util.TestTextStripper)
>    testRenderImage(org.apache.pdfbox.util.TestPDFToImage)
> 
> Tests in error:
>    testProtectionError(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
>    testProtection(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
>    testMultipleRecipients(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
>    testParsingTroublePDFs(org.apache.pdfbox.pdfparser.TestPDFParser)
> 
> 
> I looked at the output of TestTextStripper, and most of the differences 
> involve the glyph names circlecopyrt, angbracketleft, and 
> angbracketright, which were removed from glyphlist.txt in this commit:
> 
> http://svn.apache.org/viewvc?view=revision&revision=793058
> 
> So my first question: how should these glyphs be getting resolved now 
> that they're not in glyphlist.txt; or do the tests need to be updated?
We have to add the missing mappings. I've filed an issue in JIRA [1].
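
For illustration only, here is a minimal sketch of how entries in the Adobe glyph list format ("glyphname;XXXX", with "#" comment lines) resolve a glyph name to Unicode. The class GlyphListSketch and its parse method are hypothetical and are not PDFBox code, and the code points shown for the three glyph names are assumptions chosen for the example, not necessarily the values that will be added for [1].

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: parses text in the
// Adobe glyph list format "glyphname;XXXX[ XXXX ...]".
public class GlyphListSketch {

    public static Map<String, String> parse(String text) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            // Skip blank lines and "#" comments.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int sep = line.indexOf(';');
            if (sep < 0) {
                continue; // not a mapping line
            }
            String name = line.substring(0, sep).trim();
            String values = line.substring(sep + 1).trim();
            if (name.isEmpty() || values.isEmpty()) {
                continue;
            }
            // A glyph may map to several code points, separated by spaces.
            StringBuilder unicode = new StringBuilder();
            for (String hex : values.split("\\s+")) {
                unicode.appendCodePoint(Integer.parseInt(hex, 16));
            }
            map.put(name, unicode.toString());
        }
        return map;
    }

    public static void main(String[] args) {
        // The code points below are assumptions made for this example.
        String entries =
                "# illustrative entries only\n"
              + "circlecopyrt;00A9\n"
              + "angbracketleft;2329\n"
              + "angbracketright;232A\n";
        for (Map.Entry<String, String> e : parse(entries).entrySet()) {
            System.out.printf("%s -> U+%04X%n",
                    e.getKey(), e.getValue().codePointAt(0));
        }
    }
}
```

A glyph name with no entry simply has no mapping in the parsed table, which would be consistent with the text differences you describe.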

> The remaining errors in TestTextStripper are all in the file 
> solidconvertor.pdf .  The expected output file appears to be in UTF-16, 
> but the actual output file is a strange mixture of UTF-8 and corrupt 
> UTF-16.  Second question: any idea why a corrupt output file is being 
> generated?
> 
> I also looked into TestPDFParser and the problem was a missing input 
> file.  I gather from an old mailing list post that it was removed 
> because of copyright problems.
There are some "unofficial" test files. I guess it's one of them.
 
> By this point I was getting the impression that these tests weren't 
> intended for me to run, so I didn't bother trying to figure out what was 
> going wrong in the other cases.  My third question: is it expected that 
> the tests I listed above fail, or are there any that I should look into 
> as potential indicators of bugs?
These tests exist to help us find bugs after changes, so we do expect them 
to pass. We are trying to increase the number of test PDFs to cover as many 
test cases as possible, but that is not easy because of the known licensing 
and confidentiality issues with some otherwise suitable PDFs.

> Thanks
> -Aaron
Thanks for the report.

BR
Andreas Lehmkühler

[1] https://issues.apache.org/jira/browse/PDFBOX-557
