Hi Martin, what version of PDFBox are you using? Have you tried the sort option of the ExtractText command-line tool?
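A rough sketch of that invocation, assuming a PDFBox 2.x standalone app jar; the jar name and the file names book.pdf and book.txt are placeholders, and the exact syntax may differ in other releases:

```shell
# Assumption: pdfbox-app-2.x.y.jar stands in for whichever app jar you downloaded.
# -sort orders the text by its position on the page before writing it out.
java -jar pdfbox-app-2.x.y.jar ExtractText -sort book.pdf book.txt
```

Sorting changes the order in which text fragments are emitted; it may not reproduce tabs or indentation exactly, so it is worth comparing the output with and without the option.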
Andreas Lehmkühler

Martin Obreshkov wrote:
Hi, I want to extract the text from a PDF file (a book) and then index the book's content. When I extract the text, there are no newlines, tabs, etc. How can I extract text from a PDF and keep the original formatting (mainly newlines and tabs)?
