Hi,

> On 5 September 2013 at 08:15, Omid Rashidi <[email protected]> wrote:
>
>
> Hi
>
> I want to extract table data from a PDF file with PDFBox in Java,
> but runs of more than one space are removed when I extract the text of the PDF.
>
>
>             PDDocument _pd=PDDocument.load("3.pdf");
>             PDFTextStripper _txt=new PDFTextStripper();
>             System.out.println(_txt.getText(_pd));
>
> How can I prevent the white spaces from being removed?

This is just a guess as you didn't provide a sample pdf.

Many PDFs don't contain any white space characters at all. Most likely every
character is placed directly at specific coordinates. To produce a gap, the PDF
simply adds an extra offset to the coordinates of the next character instead of
storing a space character.
To sum it up, I'm pretty sure that everything works as expected.
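
To illustrate the point about coordinates, here is a minimal standalone sketch
of how a text extractor could turn horizontal gaps back into space characters.
It does not use PDFBox at all: the Glyph type, the rebuild method and the fixed
spaceWidth value are hypothetical names and assumptions made up for this
example, not the way PDFBox itself decides where to put spaces.

```java
import java.util.ArrayList;
import java.util.List;

public class GapSpacing {

    /** A character placed at an absolute x coordinate (hypothetical type, not a PDFBox class). */
    static final class Glyph {
        final char ch;
        final double x;      // left edge of the glyph
        final double width;  // advance width of the glyph

        Glyph(char ch, double x, double width) {
            this.ch = ch;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Rebuilds one line of text from glyphs sorted left to right.
     * The gap between the end of one glyph and the start of the next is
     * divided by an assumed space width to decide how many spaces to emit.
     */
    static String rebuild(List<Glyph> glyphs, double spaceWidth) {
        StringBuilder sb = new StringBuilder();
        double cursor = Double.NaN; // right edge of the previous glyph
        for (Glyph g : glyphs) {
            if (!Double.isNaN(cursor)) {
                double gap = g.x - cursor;
                long spaces = Math.round(gap / spaceWidth);
                // Overlapping or touching glyphs give zero or negative counts: emit nothing.
                for (long i = 0; i < spaces; i++) {
                    sb.append(' ');
                }
            }
            sb.append(g.ch);
            cursor = g.x + g.width;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Glyph> line = new ArrayList<>();
        line.add(new Glyph('A', 0, 6));
        line.add(new Glyph('B', 6, 6));   // touches 'A', so no space
        line.add(new Glyph('C', 30, 6));  // 18 units after 'B' ends
        System.out.println("[" + rebuild(line, 6.0) + "]");
    }
}
```

An extractor that only checks whether a gap exists, rather than how wide it is,
would emit a single space between B and C, which matches what you are seeing.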

> thanks,
> rashidi

BR
Andreas Lehmkühler