Hi,

Sent: Mon, 07 Feb 2011 From: Yogesh<[email protected]>

> Hello,
> 
> I am trying to extract text from PDFs, mostly scientific literature. The
> documents have 10 pages on average.
> When I run the extraction code, I get text for only the 1st page. For the
> rest, I get the following error:
> 
> Feb 7, 2011 5:18:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont extractToUnicodeEncoding
> SEVERE: Error: Could not load embedded CMAP
> The handle is invalid
> 
> What might be wrong? Please help. Thanks.
Which version of PDFBox are you using? This sounds like an issue that has
already been fixed in the current trunk.

BR
Andreas Lehmkühler
