> Hello there,
>
> I was using pdfbox 0.8 version and
> PDFTextStripper.processTextPosition(TextPosition text) was called for
> every "field"???. With 1.0 it looks like it is calling it for every
> character. Could you please tell me how to get it to call only on every
> "field". Thank you.
>
In short, your PDF document contains a "character spacing" instruction, which PDFTextStripper now correctly honors. As a result, processTextPosition is called once per character instead of once per run of text. The change is detailed here:
https://issues.apache.org/jira/browse/PDFBOX-520

Since this change had no negative impact on the correctness of PDFTextStripper's output (quite the contrary!), could you please elaborate on what the downside of this solution is for you? A noticeable performance degradation?

VR
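
If the issue is that your own code expects one callback per "field", one possible workaround is to regroup the per-character positions yourself. The following is only a rough sketch under assumptions, not PDFBox API: GlyphGrouper, Glyph and maxGap are hypothetical names, and Glyph merely stands in for the values a TextPosition exposes (its text plus, for example, getX(), getY() and getWidth()). It merges consecutive characters on the same line whose horizontal gap stays within maxGap.

```java
import java.util.ArrayList;
import java.util.List;

public class GlyphGrouper {

    /** Hypothetical stand-in for the few TextPosition values used here. */
    public static final class Glyph {
        final String text;
        final float x;
        final float y;
        final float width;

        public Glyph(String text, float x, float y, float width) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    /**
     * Merges consecutive glyphs into "fields". A new field starts when the
     * glyph is on a different line, or when the horizontal gap to the
     * previous glyph is larger than maxGap in either direction.
     */
    public static List<String> group(List<Glyph> glyphs, float maxGap) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Glyph prev = null;

        for (Glyph g : glyphs) {
            if (prev != null) {
                float gap = g.x - (prev.x + prev.width);
                boolean sameLine = Math.abs(g.y - prev.y) < 0.5f;
                if (!sameLine || Math.abs(gap) > maxGap) {
                    // Close the field collected so far and start a new one.
                    fields.add(current.toString());
                    current.setLength(0);
                }
            }
            current.append(g.text);
            prev = g;
        }

        if (current.length() > 0) {
            fields.add(current.toString());
        }
        return fields;
    }
}
```

In a PDFTextStripper subclass you could collect one such entry per processTextPosition call and run the grouping at the end of each page. The right value for maxGap depends on the font size and on the character spacing used in your document, so it would need tuning.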

