Re: Text extracted from only 1st page, not the rest

Andreas Lehmkühler Fri, 11 Feb 2011 01:46:44 -0800

Hi,

Sent: Mon, 07 Feb 2011 From: Yogesh <[email protected]>


> Hello,
> 
> I am trying to extract text from PDFs, mostly scientific literature. The
> documents average about 10 pages.
> When I run the extraction code, I get text for only the 1st page. For the
> rest, I get the following error:
> 
> Feb 7, 2011 5:18:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont extractToUnicodeEncoding
> SEVERE: Error: Could not load embedded CMAP
> The handle is invalid
> 
> What might be wrong? Please help. Thanks.

Which version of PDFBox are you using? This sounds like an issue that has
already been fixed in the current trunk.

BR
Andreas Lehmkühler
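
If you are not sure which version is on your classpath, one possible way to find out is to read the Implementation-Version entry from the jar's manifest. The following is only a minimal sketch using the standard java.util.jar classes. The class name JarVersion and the method implementationVersion are made up for illustration, and it assumes the jar was built with an Implementation-Version entry, which may not be the case for every build.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of PDFBox: reads the version from a jar manifest.
public class JarVersion {

    // Returns the Implementation-Version main attribute, or null if the jar
    // has no manifest or the manifest has no such entry.
    static String implementationVersion(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes()
                           .getValue(Attributes.Name.IMPLEMENTATION_VERSION);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarVersion <path-to-jar>");
            return;
        }
        String version = implementationVersion(args[0]);
        System.out.println(version != null
                ? "Implementation-Version: " + version
                : "No Implementation-Version entry found in the manifest");
    }
}
```

Run it with the path to the PDFBox jar you are using as the only argument. If no entry is found, the jar file name itself usually carries the version.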
