Thanks
On Fri, Nov 20, 2009 at 6:11 PM, Patrick Herber <[email protected]> wrote:
> Hello
>
> See perhaps this answer to a similar question:
>
> http://www.mail-archive.com/[email protected]/msg01812.html
>
> Regards,
> Patrick
>
> Stephen Haggai wrote:
>>
>> Hello,
>>
>> I too tried to extract text from a PDF file but I keep getting these
>> errors though the text seems to be fully extracted (not verified
>> though).
>>
>> My code:
>>
>> import java.io.*;
>> import org.apache.pdfbox.pdmodel.*;
>> import org.apache.pdfbox.util.*;
>>
>> public class PDFTest {
>>
>>     public static void main(String[] args){
>>         PDDocument pd;
>>         BufferedWriter wr;
>>         try {
>>             File input = new File("C:\\invoice.pdf");
>>             File output = new File("C:\\SampleText.txt");
>>             pd = PDDocument.load(input);
>>             System.out.println(pd.getNumberOfPages());
>>             System.out.println(pd.isEncrypted());
>>             //pd.save("new.pdf");
>>             PDFTextStripper stripper = new PDFTextStripper();
>>             //String text = stripper.getText(pd);
>>             wr = new BufferedWriter(new OutputStreamWriter(new
>> FileOutputStream(output)));
>>             stripper.writeText(pd, wr);
>>             //System.out.println(text);
>>             if (pd != null) {
>>                 pd.close();
>>             }
>>         } catch (Exception e){
>>             e.printStackTrace();
>>         }
>>     }
>> }
>>
>> The "error' or message that I get is
>>
>> --------------------Configuration: <Default>--------------------
>> 5
>> false
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: g
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: rg
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: RG
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: n
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: re
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: W
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: BI
>> 20/11/2009 2:17:24 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: EI
>> 20/11/2009 2:17:26 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: m
>> 20/11/2009 2:17:26 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: l
>> 20/11/2009 2:17:26 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: h
>> 20/11/2009 2:17:26 AM org.apache.pdfbox.util.PDFStreamEngine
>> processOperator
>> INFO: unsupported/disabled operation: S
>>
>> Process completed.
>>
>> Is my code wrong somewhere?
>>
>> Thanks,
>> Stephen
>>
>>
>
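
For the archive: the messages in the quoted output are logged at INFO level, and their layout looks like the default java.util.logging format. Assuming PDFBox's logging ends up in java.util.logging under logger names that match its class names (an assumption based on that output, not something confirmed in this thread), a minimal sketch for hiding them could look like this. The class name QuietPdfBoxLogging and the method silenceInfoMessages are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietPdfBoxLogging {

    // Keep a strong reference: java.util.logging holds loggers weakly,
    // so a configured logger could otherwise be garbage-collected.
    // The logger name is an assumption taken from the class name in the quoted output.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    /** Raises the threshold so INFO records are dropped; WARNING and above still pass. */
    public static void silenceInfoMessages() {
        PDFBOX_LOGGER.setLevel(Level.WARNING);
    }

    public static void main(String[] args) {
        silenceInfoMessages();

        // Child loggers without their own level inherit it from the nearest configured parent.
        Logger engine = Logger.getLogger("org.apache.pdfbox.util.PDFStreamEngine");
        System.out.println(engine.isLoggable(Level.INFO));    // false
        System.out.println(engine.isLoggable(Level.WARNING)); // true
    }
}
```

Calling silenceInfoMessages() once at the start of main, before PDDocument.load, should be enough if the assumption holds. If a different logging backend (for example log4j) is on the classpath, its own configuration would have to be changed instead.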

