Hi,

Pramod Pradhan wrote:
> Thank you for the sample code, Andreas, but I am hitting another exception
> now.
>
> I get the exception below when I use the code you provided.
> Can you please help?
>
> Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
>     at org.apache.pdfbox.util.PDFStreamEngine.<init>(PDFStreamEngine.java:137)
>     at org.apache.pdfbox.util.PDFTextStripper.<init>(PDFTextStripper.java:162)
>     at ExtractText.main(ExtractText.java:230)
> Caused by: java.lang.ClassCastException:
> org.pdfbox.util.operator.ShowTextGlyph cannot be cast to
> org.apache.pdfbox.util.operator.OperatorProcessor
>     at org.apache.pdfbox.util.PDFStreamEngine.<init>(PDFStreamEngine.java:131)
>     ... 2 more

Your environment is mixed up: you have both pdfbox versions on the classpath.
All pdfbox classes in the current version use the package prefix
"org.apache.pdfbox", but your stack trace shows at least one class with the
prefix "org.pdfbox", which was used in earlier versions. Remove the older
pdfbox jar from the classpath so that only one version remains.
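
One way to confirm a mix-up like this is to ask the JVM where it loads each class from. The following is only a minimal sketch using the standard library: `ClasspathCheck` and `locate` are hypothetical names introduced for illustration, and the two class names are simply the ones that appear in the stack trace above.

```java
import java.security.CodeSource;

class ClasspathCheck {

    // Hypothetical helper: returns where a class was loaded from,
    // or null if the class cannot be loaded from the current classpath.
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader report no code source
            return src == null ? "(no code source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.pdfbox.util.PDFStreamEngine",   // current package prefix
            "org.pdfbox.util.operator.ShowTextGlyph"    // former package prefix
        };
        for (String name : names) {
            String where = locate(name);
            System.out.println(name + " -> "
                    + (where == null ? "not on classpath" : where));
        }
    }
}
```

Run it with the same classpath you use for ExtractText. If both names resolve to a location, two pdfbox versions are present, and the location printed for the "org.pdfbox" class points at the jar to remove.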

BR
Andreas Lehmkühler

> thanks,
> ~pramod
>
> 2009/10/27 Andreas Lehmkühler <andr...@lehmi.de>
>
>> Hi,
>>
>> Subject: java.io.IOException: expected='startxref'
>> Sent: Tue, 27 Oct 2009
>> From: Pramod Pradhan
>>
>>> Hi All,
>>> I am trying to write some simple code that parses the text from a
>>> PDF file and prints it to the console. I am hitting the exception below:
>>>
>>> java.io.IOException: expected='startxref' actual=''
>>> org.pdfbox.io.pushbackinputstr...@100ab23
>>>     at org.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:355)
>>>     at org.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:176)
>>>     at PDFTextParser.pdftoText(PDFTextParser.java:49)
>>>     at PDFTextParser.main(PDFTextParser.java:93)
>>> PDF to Text Conversion failed.
>>
>> Looking at the stack trace, you are obviously using an older version of
>> pdfbox. I suggest updating to pdfbox 0.8.0. It is available at [1].
>>
>>> Can someone please help? I have attached the Java class file.
>>
>> Your attachment didn't make it because of the mailing list policy.
>> If you are looking for an example of how to extract text from a PDF, have a
>> look at ExtractText [2].
>>
>> BR
>> Andreas Lehmkühler
>>
>> [1] http://incubator.apache.org/pdfbox/download.html
>> [2] http://svn.apache.org/repos/asf/incubator/pdfbox/trunk/src/main/java/org/apache/pdfbox/ExtractText.java