Hi,

> Omid Rashidi <[email protected]> wrote on 5 September 2013 at 08:15:
>
> Hi
>
> I want to extract table data of PDF file with PDFBOX in JAVA .
> but removed white spaces more one space when I extract text of PDF .
>
> PDDocument _pd=PDDocument.load("3.pdf");
> PDFTextStripper _txt=new PDFTextStripper();
> System.out.println(_txt.getText(_pd));
>
> how to bridle to removing white spaces?

This is just a guess, as you didn't provide a sample PDF.
Many PDFs don't contain any whitespace characters at all. Most likely, all characters are placed directly using specific coordinates. To produce some space, an additional offset is added to the coordinates.

To sum it up, I'm pretty sure that everything works as expected.

> thanks,
> rashidi

BR
Andreas Lehmkühler
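
P.S. To illustrate the point about coordinates, here is a minimal sketch of how spacing could be rebuilt from glyph positions. It does not use PDFBox at all: the `Glyph` class, the `rebuild` method and the `spaceWidth` parameter are hypothetical names made up for this example, and the numbers are invented. It only shows the idea of turning a horizontal gap between two characters into a matching number of spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not PDFBox API: every glyph carries its own
// x position and width, similar to how characters are placed in a PDF.
public class GapSpacing {

    static final class Glyph {
        final char ch;
        final float x;      // left edge of the glyph
        final float width;  // horizontal advance of the glyph

        Glyph(char ch, float x, float width) {
            this.ch = ch;
            this.x = x;
            this.width = width;
        }
    }

    // Rebuilds one line of text, emitting roughly one space per
    // spaceWidth units of horizontal gap between neighbouring glyphs.
    static String rebuild(List<Glyph> glyphs, float spaceWidth) {
        StringBuilder sb = new StringBuilder();
        float prevEnd = Float.NaN; // right edge of the previous glyph
        for (Glyph g : glyphs) {
            if (!Float.isNaN(prevEnd)) {
                float gap = g.x - prevEnd;
                int spaces = Math.round(gap / spaceWidth);
                // A zero or negative count (touching or overlapping glyphs)
                // simply adds no spaces.
                for (int i = 0; i < spaces; i++) {
                    sb.append(' ');
                }
            }
            sb.append(g.ch);
            prevEnd = g.x + g.width;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Glyph> line = new ArrayList<>();
        line.add(new Glyph('A', 0f, 10f));
        line.add(new Glyph('B', 10f, 10f)); // directly after 'A', no gap
        line.add(new Glyph('C', 50f, 10f)); // gap of 30 units before 'C'
        System.out.println("[" + rebuild(line, 10f) + "]"); // prints "[AB   C]"
    }
}
```

With real documents the input would have to come from the character positions that the text extraction reports, and the glyphs would first need to be grouped into lines and sorted by x position; the sketch assumes that has already happened.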

