Thank you for the reply, but when I use the -sort option I get incomprehensible characters like this:

DCC555A 3 5C77 e 577 r777ge oa0000drlu66 6e 1 iI344t nDe 901Dmt .e 753e___efr E CCCanpDDDuta x PPPl:tr p ___ S ty 000000w :osu 000a000p111000s000 P111A r 777u333rB e 222c333RhaA

Note: the document is not encrypted and the font is Arial.

Thanks again.

2011/1/25 <[email protected]>

> users Digest 25 Jan 2011 12:35:54 -0000 Issue 329
>
> Topics (messages 1880 through 1886):
>
> Re: Parsing Problem
>     1880 by: Andreas Lehmkuehler
>
> Re: Type1C font Error
>     1881 by: Andreas Lehmkuehler
>
> Re: How to draw annotation rectangle in PDF
>     1882 by: Andreas Lehmkuehler
>     1883 by: prashant mangate
>
> Parsing Problem : words in disorder
>     1884 by: Walid KRIFI
>     1885 by: Andreas Lehmkuehler
>
> NSAutoreleaseNoPool leaking in Tomcat
>     1886 by: Alexander Chow
>
> Administrivia:
>
> ---------------------------------------------------------------------
> To post to the list, e-mail: [email protected]
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
> ----------------------------------------------------------------------
>
> ---------- Forwarded message ----------
> From: Andreas Lehmkuehler <[email protected]>
> To: [email protected]
> Date: Sun, 23 Jan 2011 12:13:59 +0100
> Subject: Re: Parsing Problem
> Hi,
>
> On 21.01.2011 10:49, Walid KRIFI wrote:
>
>> Hi All,
>> When trying to extract text from a PDF file, I get the extracted words in
>> disorder.
>> Any idea?
>>
> Sounds like a missing sort option. See [1] for further details.
>
> BR
> Andreas Lehmkühler
>
> [1] http://pdfbox.apache.org/commandlineutilities/ExtractText.html
>
> ---------- Forwarded message ----------
> From: Andreas Lehmkuehler <[email protected]>
> To: [email protected]
> Date: Sun, 23 Jan 2011 12:17:09 +0100
> Subject: Re: Type1C font Error
> Hi,
>
> On 20.01.2011 22:21, Yogesh wrote:
>
>> Hi,
>>
>> I am still getting the error
>>
>> org.apache.pdfbox.pdmodel.font.PDFontFactory createFont
>> WARNING: Failed to create Type1C font. Falling back to Type1 font
>> java.io.IOException: The handle is invalid
>>
> Did you update your local PDFBox copy and recompile it?
>
> BR
> Andreas Lehmkühler
>
> On 2 January 2011 13:50, Andreas Lehmkuehler <[email protected]> wrote:
>>
>> Hi,
>>>
>>> On 05.12.2010 07:31, Yogesh wrote:
>>>
>>> I am getting an IOException, but the StackTrace looks similar.
>>>
>>>> This does not seem to be resolved yet, or is it?
>>>>
>>>> PDFBOX-708 is resolved in the current trunk (revision 1054449)
>>>
>>> BR
>>> Andreas Lehmkühler
>>>
>>> On 5 December 2010 01:05, Hesham G. <[email protected]> wrote:
>>>
>>>> Is your problem related to this:
>>>>
>>>>> https://issues.apache.org/jira/browse/PDFBOX-708
>>>>>
>>>>> Best regards,
>>>>> Hesham
>>>>>
>>>>> ---------------------------------------------
>>>>> Included message:
>>>>>
>>>>> Hello,
>>>>>
>>>>>> I am trying to extract text from a set of PDF files. I keep getting the
>>>>>> following error for some of the files.
>>>>>>
>>>>>> Dec 4, 2010 7:50:19 PM org.apache.pdfbox.pdmodel.font.PDFontFactory createFont
>>>>>> WARNING: Failed to create Type1C font. Falling back to Type1 font
>>>>>> java.io.IOException: The handle is invalid
>>>>>>     at java.io.RandomAccessFile.seek(Native Method)
>>>>>>     at org.apache.pdfbox.io.RandomAccessFile.seek(RandomAccessFile.java:59)
>>>>>>     at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
>>>>>>     at java.io.BufferedInputStream.fill(Unknown Source)
>>>>>>     at java.io.BufferedInputStream.read1(Unknown Source)
>>>>>>     at java.io.BufferedInputStream.read(Unknown Source)
>>>>>>     at java.io.FilterInputStream.read(Unknown Source)
>>>>>>     at org.apache.pdfbox.pdmodel.font.PDType1CFont.loadBytes(PDType1CFont.java:429)
>>>>>>     at org.apache.pdfbox.pdmodel.font.PDType1CFont.load(PDType1CFont.java:318)
>>>>>>     at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:123)
>>>>>>     at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:124)
>>>>>>     at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:76)
>>>>>>     at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:115)
>>>>>>     at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:243)
>>>>>>     at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225)
>>>>>>     at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:441)
>>>>>>     at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:365)
>>>>>>     at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:321)
>>>>>>     at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:241)
>>>>>>     at litexpr.text.PDFReader.readPage(PDFReader.java:96)
>>>>>>     at litexpr.Main2.main(Main2.java:51)
>>>>>>
>>>>>> How can I add these fonts, whatever they are? Please help.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Yogesh
>>>>>>
>
> ---------- Forwarded message ----------
> From: Andreas Lehmkuehler <[email protected]>
> To: [email protected]
> Date: Sun, 23 Jan 2011 15:52:24 +0100
> Subject: Re: How to draw annotation rectangle in PDF
> Hi,
>
> On 20.01.2011 08:11, prashant mangate wrote:
>
>> Hi,
>>
>> I want to draw a rectangle on an existing PDF as a highlighter.
>>
>> The existing PDF contains a table, and I want to highlight one of its cells
>> using the following code. But the rectangle is drawn over the cell and hides
>> it. It should look transparent (i.e. like a highlighter).
>>
>> contentStream.setNonStrokingColor(Color.pink);
>> contentStream.addRect(startX, startY+startY, width, height);
>> contentStream.fillRect(1.0f, 1.0f, 1.0f, 1.0f);
>>
> I guess you are looking for a text markup annotation. [1] provides some
> samples for different types of annotations.
> Have a look at chapter 12.5.6, "Annotation Types", in [2] to learn more about
> annotations.
>
> BR
> Andreas Lehmkühler
>
> [1] http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/examples/pdmodel/Annotation.java
> [2] http://www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf
>
> ---------- Forwarded message ----------
> From: prashant mangate <[email protected]>
> To: [email protected]
> Date: Mon, 24 Jan 2011 11:11:05 +0530
> Subject: Re: How to draw annotation rectangle in PDF
> Hi,
>
> Thanks for your kind reply.
>
> But I don't want an annotation. I want the same effect as the annotation
> feature: if I draw a rectangle with a color, I want to be able to set the
> color's opacity, so that the text behind the rectangle remains visible.
>
> --
> Thanks & regards
> Prashant Mangate (プロシャント・マングテ)
> Software Engineer
> Softbridge Solutions (India) Pvt Ltd
> Unit #103, Tower #S4 Cybercity
> Magarpatta City, Hadapsar, Pune 411028
> Mobile: (91) 9421685015
> Email: [email protected] URL: www.softbridge-s.com
>
> ---------- Forwarded message ----------
> From: Walid KRIFI <[email protected]>
> To: [email protected]
> Date: Mon, 24 Jan 2011 09:53:12 +0100
> Subject: Parsing Problem : words in disorder
> Please help.
> When I parse a PDF with PDFBox I get the output text, but the words are in
> disorder.
> When I extract the text with Acrobat, everything comes out fine.
>
> Thanks.
>
> ---------- Forwarded message ----------
> From: "Andreas Lehmkühler" <[email protected]>
> To: [email protected]
> Date: Mon, 24 Jan 2011 10:48:26 +0100 (MET)
> Subject: Re: Parsing Problem : words in disorder
> Hi,
>
> Sent: Mon, 24 Jan 2011
> From: Walid KRIFI <[email protected]>
>
> > Please Help,
> > When I parse PDF with PDFBox I have the output text but words are in
> > disorder.
> > when i extract text with Acrobat all is gone fine.
> Please avoid double postings. I already tried to answer your question
> yesterday [1]
>
> BR
> Andreas Lehmkühler
>
> [1] http://markmail.org/message/twyzamchxqmdgqr5
>
> ---------- Forwarded message ----------
> From: Alexander Chow <[email protected]>
> To: [email protected]
> Date: Tue, 25 Jan 2011 12:35:16 +0000
> Subject: NSAutoreleaseNoPool leaking in Tomcat
> Hi there,
>
> I have been playing around with PDFBox to do some PDF processing. If I run
> PDFBox from a standalone Java application, it runs fine. However, if I use
> it from within Tomcat, I get these logs:
>
> 2011-01-25 12:22:41.485 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x15d200f40 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.507 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x10063ca70 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.547 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x100666830 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.557 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x10063e5c0 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.602 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x100167b20 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.617 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x10011fb60 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.760 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x100677310 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.765 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x15d24e690 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.879 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x100644500 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
> 2011-01-25 12:22:41.887 java[33334:60f] *** __NSAutoreleaseNoPool(): Object 0x10063ebe0 of class NSConcreteMapTableValueEnumerator autoreleased with no pool in place - just leaking
>
> I figured it was because PDFBox needed to be run in headless mode, so I
> tried setting my environment to have:
>
> CATALINA_OPTS=-Djava.awt.headless=true
>
> Unfortunately, that didn't seem to help much either.
>
> Here's my OS X java -version output, if you are interested (it's the latest
> update for Snow Leopard):
>
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-10M3261)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>
> Any thoughts on this?
>
> Cheers,
> Alex

--
------------------
Best regards,
Krifi Walid

