Re: Text extracted from only 1st page, not the rest

Yogesh Fri, 11 Feb 2011 08:31:12 -0800

Hi Andreas,

I am using the 1.5.0-snapshot from the trunk.


What might be causing the error?

Thanks

- Yogesh



2011/2/11 Andreas Lehmkühler <[email protected]>

> Hi,
>
> Sent: Mon, 07 Feb 2011 From: Yogesh <[email protected]>
>
> > Hello,
> >
> > I am trying to extract text from PDFs, mostly scientific literature. The
> > documents average 10 pages each.
> > When I run the extraction code, I get text for only the 1st page. For the
> > rest, I get the following error:
> >
> > Feb 7, 2011 5:18:13 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont
> > extractToUnicodeEncoding
> > SEVERE: Error: Could not load embedded CMAP
> > The handle is invalid
> >
> > What might be wrong? Please help. Thanks
> What version of PDFBox are you using? Sounds like an issue which is already
> fixed in the current trunk.
>
> BR
> Andreas Lehmkühler
>
