Hello VR,
I read the link you sent me, but it is beyond my understanding of PDFs and PDFBox's PDFTextStripper. I am trying to parse this content from the PDF. With 0.8, PDFTextStripper.processTextPosition() was called once for every column value (e.g. "Mt. Pleasant, SC 29466-8583"), so I planned to use the getYDirAdj and getXDirAdj methods to sort the values and pick them out.

Now I do not know where each of those column values ends. For example, how will I know that "Mt. Pleasant, SC 29466-8583" comes from one "field" if I get one character at a time? Also, setSortByPosition(true) doesn't work with processTextPosition().

Could you please tell me if there is a better way to do this?

Thank you.

Regards,
Rekha

From: Villu Ruusmann <[email protected]>
To: [email protected]
Date: 02/19/2010 05:42 AM
Subject: Re: PDFTextStripper.processTextPosition

Hello there,

>
> I was using pdfbox 0.8 version and
> PDFTextStripper.processTextPosition(TextPosition text) was called for
> every "field"???. With 1.0 it looks like it is calling it for every
> character. Could you please tell me how to get it to call only on every
> "field". Thank you.
>

In short, your PDF document contains a "character spacing" instruction, which the PDFTextStripper now correctly honors. The change is detailed here:
https://issues.apache.org/jira/browse/PDFBOX-520

Since this change didn't have a negative impact on the correctness of the output of PDFTextStripper (quite the contrary!), could you please elaborate on what the downside of this solution is for you? A noticeable performance degradation?

VR
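P.S. To make the question at the top concrete, here is a rough sketch of the grouping I have in mind, assuming the characters have to be reassembled by hand. It does not call PDFBox at all: Glyph is a hypothetical holder for values copied out of each TextPosition (x and y as returned by getXDirAdj and getYDirAdj, plus a glyph width that I assume is also available), and both thresholds are guesses that would need tuning per document. Characters whose y values are within yTolerance are treated as one line, each line is sorted by x, and a new "field" starts whenever the horizontal gap to the previous character exceeds maxGap. It also assumes that spaces inside a field arrive as their own characters.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical holder for values copied from a TextPosition; not a PDFBox class. */
final class Glyph {
    final String text;   // the character itself
    final double x;      // e.g. the value of getXDirAdj()
    final double y;      // e.g. the value of getYDirAdj()
    final double width;  // glyph width, assumed to be available as well

    Glyph(String text, double x, double y, double width) {
        this.text = text;
        this.x = x;
        this.y = y;
        this.width = width;
    }
}

final class FieldGrouper {

    /** Groups single characters into fields, line by line, left to right. */
    static List<String> groupIntoFields(List<Glyph> glyphs, double yTolerance, double maxGap) {
        // Sort a copy by y so that characters of the same line become neighbours.
        List<Glyph> sorted = new ArrayList<>(glyphs);
        sorted.sort(Comparator.comparingDouble((Glyph g) -> g.y));

        List<String> fields = new ArrayList<>();
        List<Glyph> line = new ArrayList<>();
        for (Glyph g : sorted) {
            // A y jump larger than the tolerance closes the current line.
            if (!line.isEmpty() && Math.abs(g.y - line.get(0).y) > yTolerance) {
                splitLine(line, maxGap, fields);
                line = new ArrayList<>();
            }
            line.add(g);
        }
        if (!line.isEmpty()) {
            splitLine(line, maxGap, fields);
        }
        return fields;
    }

    /** Sorts one line by x and cuts it wherever the horizontal gap exceeds maxGap. */
    private static void splitLine(List<Glyph> line, double maxGap, List<String> out) {
        line.sort(Comparator.comparingDouble((Glyph g) -> g.x));
        StringBuilder field = new StringBuilder();
        double previousEnd = 0;
        for (Glyph g : line) {
            if (field.length() > 0 && g.x - previousEnd > maxGap) {
                out.add(field.toString());
                field.setLength(0);
            }
            field.append(g.text);
            previousEnd = g.x + g.width;
        }
        out.add(field.toString());
    }
}
```

The idea would be to collect one Glyph per processTextPosition() call and run groupIntoFields(glyphs, 1.0, 10.0) once per page. The values 1.0 and 10.0 are placeholders only; a sensible maxGap probably depends on the font size and on the character spacing used in the document.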

