Thank you for the sample code, Andreas, but I am hitting another exception
now.

I get the exception below when I use the piece of code you provided.
Can you please help?

Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
at org.apache.pdfbox.util.PDFStreamEngine.<init>(PDFStreamEngine.java:137)
at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:162)
at ExtractText.main(ExtractText.java:230)
Caused by: java.lang.ClassCastException:
org.pdfbox.util.operator.ShowTextGlyph cannot be cast to
org.apache.pdfbox.util.operator.OperatorProcessor
at org.apache.pdfbox.util.PDFStreamEngine.<init>(PDFStreamEngine.java:131)
... 2 more

Thanks,
~pramod
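
Note: the ClassCastException above names a class from the old org.pdfbox
package being cast to a type from org.apache.pdfbox. One possible cause (an
assumption, not confirmed in this thread) is that an older PDFBox jar or a
stale configuration resource is still on the classpath next to the new one.
Below is a minimal sketch that reports whether the two classes named in the
trace can be loaded and where they come from. The class name ClasspathCheck
and the helper locate() are made up for illustration.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns a description of where the named class was loaded from,
     * or null if the class cannot be loaded from the current classpath.
     */
    static String locate(String className) {
        try {
            // initialize=false: we only want to find the class, not run its static code
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null
                    ? "(bootstrap or unknown location)"
                    : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Both names are taken verbatim from the stack trace above.
        String[] names = {
            "org.pdfbox.util.operator.ShowTextGlyph",
            "org.apache.pdfbox.util.operator.OperatorProcessor"
        };
        for (String name : names) {
            String where = locate(name);
            System.out.println(name + " -> " + (where == null ? "not found" : where));
        }
    }
}
```

Run it with the same classpath as the failing program. If the org.pdfbox
class is reported as found, an old jar is probably still present and may be
worth removing before retrying.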

2009/10/27 Andreas Lehmkühler <andr...@lehmi.de>

> Hi,
>
> Subject: java.io.IOException: expected='startxref'
> Sent: Tue, 27 Oct 2009
> From: Pramod Pradhan
>
> >Hi All,
> >I am trying to write some simple code to parse the text data from a
> >PDF file onto the console. I am hitting the exception below:
> >java.io.IOException: expected='startxref' actual='' org.pdfbox.io.pushbackinputstr...@100ab23
> >  at org.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:355)
> >  at org.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:176)
> >  at PDFTextParser.pdftoText(PDFTextParser.java:49)
> >  at PDFTextParser.main(PDFTextParser.java:93)
> >PDF to Text Conversion failed.
> Looking at the stack trace, you're obviously using an older version of
> PDFBox. I suggest updating to PDFBox 0.8.0. It is available at [1].
>
> >Can someone please help? I have attached the Java class file.
> Your attachment didn't make it because of the mailing list policy.
> If you are looking for an example of how to extract text from a PDF,
> have a look at ExtractText [2].
>
> BR
> Andreas Lehmkühler
>
> [1] http://incubator.apache.org/pdfbox/download.html
> [2]
> http://svn.apache.org/repos/asf/incubator/pdfbox/trunk/src/main/java/org/apache/pdfbox/ExtractText.java
>



-- 
thanks,
Pramod Pradhan
(361)228-3989
