Hello,
I have a PDF file created with LaTeX. I am trying to read and print every
character in that file using PDFBox, but all spaces in the file are missing
from the output. Here is the code I am using:

    PDPage page = (PDPage) allPages.get( 0 );
    PDStream contents = page.getContents();
    if ( contents != null ) {
        PDFTextStripperProcessor pdfTextStripperProcessor = new PDFTextStripperProcessor();
        pdfTextStripperProcessor.processStream( page, page.findResources(), contents.getStream() );
    }

    public class PDFTextStripperProcessor extends PDFTextStripper {
        @Override
        public void processTextPosition( TextPosition text ) {
            // Called once per glyph; prints each character on its own line.
            System.out.println( text.getCharacter() );
        }
    }

A one-page sample file that reproduces the problem is available here:
https://dl.dropboxusercontent.com/u/10111483/downloads/pdfbox/pdf_latex_spaces_ignored.pdf
What is causing this?
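My only guess so far, which I have not confirmed for this file: LaTeX-generated PDFs often move the text position between words instead of emitting a space glyph, so there may simply be no space characters for processTextPosition to receive. If that is right, spaces would have to be inferred from the horizontal gap between consecutive glyphs. Below is a minimal sketch of that idea in plain Java. The Glyph class, its fields, and the gapFactor threshold are hypothetical stand-ins for illustration only; they are not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;

public class GapSpaceSketch {

    /** Hypothetical stand-in for a positioned glyph; NOT a PDFBox class. */
    public static final class Glyph {
        final String text;  // the character(s) this glyph represents
        final float x;      // left edge of the glyph on the line
        final float width;  // horizontal extent of the glyph

        public Glyph(String text, float x, float width) {
            this.text = text;
            this.x = x;
            this.width = width;
        }
    }

    /**
     * Joins glyphs left to right and inserts a space wherever the gap to the
     * previous glyph exceeds gapFactor times the previous glyph's width.
     * The threshold is an assumption and would need tuning for real documents.
     */
    public static String joinWithInferredSpaces(List<Glyph> glyphs, float gapFactor) {
        StringBuilder out = new StringBuilder();
        Glyph prev = null;
        for (Glyph g : glyphs) {
            if (prev != null) {
                float gap = g.x - (prev.x + prev.width);
                if (gap > gapFactor * prev.width) {
                    out.append(' ');
                }
            }
            out.append(g.text);
            prev = g;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Glyph> line = new ArrayList<>();
        line.add(new Glyph("H", 0f, 5f));
        line.add(new Glyph("i", 5f, 2f));
        line.add(new Glyph("y", 10f, 5f)); // 3 units after "i" ends: word gap
        line.add(new Glyph("o", 15f, 5f));
        line.add(new Glyph("u", 20f, 5f));
        System.out.println(joinWithInferredSpaces(line, 0.3f)); // prints "Hi you"
    }
}
```

This assumes glyphs arrive in reading order on a single line; line breaks and out-of-order glyphs would need separate handling.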
Best regards,
Hesham