[ https://issues.apache.org/jira/browse/PDFBOX-533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758737#action_12758737 ]
Navendu Garg edited comment on PDFBOX-533 at 9/23/09 8:40 AM:
--------------------------------------------------------------

Mel,

Thanks for implementing the writeCharacters method. It is going to save me a lot of time. I tried to use PDFTextStripper2. However, it is giving me the following info/error messages:

INFO: unsupported/disabled operation: BDC
Sep 23, 2009 10:35:54 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: g
Sep 23, 2009 10:35:54 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EMC
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.pdfbox.encoding.EncodingManager.<clinit>(EncodingManager.java:38)
    at org.apache.pdfbox.pdmodel.font.PDFont.getEncoding(PDFont.java:518)
    at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:438)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:343)
    at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:66)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:516)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:229)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:188)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
    at org.apache.pdfbox.util.TestPDFTextStripperPerf.main(TestPDFTextStripperPerf.java:27)
Caused by: java.lang.NullPointerException
    at java.io.Reader.<init>(Reader.java:61)
    at java.io.InputStreamReader.<init>(InputStreamReader.java:55)
    at org.apache.pdfbox.encoding.Encoding.loadGlyphList(Encoding.java:98)
    at org.apache.pdfbox.encoding.Encoding.<clinit>(Encoding.java:58)
    ... 12 more

> PDFTextStripper.writeCharacters is called no where in the class
> ---------------------------------------------------------------
>
>                 Key: PDFBOX-533
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-533
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.8.0-incubator
>            Reporter: Navendu Garg
>         Attachments: TestPDFTextStripperPerf.java
>
>
> It seems the writeCharacters method is not called anywhere in the PDFTextStripper
> class. This makes it impossible to handle each character's TextPosition, as well
> as line separators, because the processLineSeparator method is no longer there and
> writeLineSeparator is called when the actual writing happens.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
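
For readers following the stack trace in the comment: the NullPointerException is raised in the java.io.Reader constructor, reached through InputStreamReader from Encoding.loadGlyphList. The JDK throws there when the InputStream passed in is null, which is what getResourceAsStream returns when a classpath resource cannot be found. Whether a missing resource is actually the cause in this report is an assumption, not something the trace confirms. The sketch below shows the general pattern of checking for a null stream before wrapping it; the class name ResourceCheck, the helper openReader, and the resource name are all hypothetical and are not PDFBox APIs.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class ResourceCheck {

    /**
     * Opens a classpath resource as a Reader.
     * Fails with a descriptive FileNotFoundException instead of letting
     * InputStreamReader throw a bare NullPointerException on a null stream.
     */
    static Reader openReader(String resourceName) throws IOException {
        InputStream in = ResourceCheck.class.getClassLoader().getResourceAsStream(resourceName);
        if (in == null) {
            // getResourceAsStream returns null when the resource is not on the classpath
            throw new FileNotFoundException("Resource not found on classpath: " + resourceName);
        }
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical resource name, used only for illustration
        String name = args.length > 0 ? args[0] : "example/glyphlist.txt";
        try (Reader reader = openReader(name)) {
            System.out.println("Found " + name);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running a check like this with the same classpath as the failing program is one way to tell whether a resource file is visible at runtime.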