Hello,

While parsing a PDF file with PDFBox v1.2.1 I get the warning shown below. 
When I searched for it, I found that it was reported as a bug and fixed in 
v0.7.3, but it still seems to occur.
You can download the PDF file from here:
http://www.4shared.com/document/BL3eiOu7/expected_hex_character.html

Here is the warning:
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
 at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:336)
 at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:139)
 at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:556)
 at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:390)
 at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:386)

And here is my code:
PDDocument pdfFile = PDDocument.load( "C:\\expected_hex_character.pdf" );
PDFTextStripper stripper = new PDFTextStripper();
// The warning occurs on the next line, while parsing the PDF.
String data = stripper.getText( pdfFile );
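
The trailing 32 in the message is consistent with the character code of an 
ASCII space, which suggests the CMap parser met a space where it expected a 
hex digit. The standalone sketch below is not PDFBox code: the class 
HexTokenDemo and the method parseHexStrict are hypothetical names made up for 
illustration, and the assumption that the real parser builds its message the 
same way is only a guess based on the text of the warning.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical illustration only; this is NOT the FontBox CMapParser.
class HexTokenDemo {

    // Decodes the body of a hex string token (the text between '<' and '>').
    // Any character that is not a hex digit is rejected, including a space.
    static byte[] parseHexStrict(String body) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // first digit of the current byte, or -1 if none yet
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            int digit = Character.digit(c, 16);
            if (digit < 0) {
                // Assumed message layout: the offending character, a colon,
                // then its numeric code.
                throw new IOException(
                    "Error: expected hex character and not " + c + ":" + (int) c);
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd digit count: pad the last byte with 0
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        try {
            parseHexStrict("00 41"); // a space inside the hex digits
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is right, the file probably contains a CMap with whitespace 
inside a hex string, and a parser that skipped whitespace at that point would 
not raise the warning.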


Best regards,
Hesham
