Hello,

I am trying to extract text from PDFs, mostly scientific literature. The
documents average about 10 pages.
When I run the extraction code, I get text for only the first page. For the
rest, I get the following error:

Feb 7, 2011 5:18:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont extractToUnicodeEncoding
SEVERE: Error: Could not load embedded CMAP
The handle is invalid

What might be wrong? Please help. Thanks.

-Yogesh
