Hi All, I have no problem extracting text from a PDF document using pdfbox-app-1.5.0.jar, but the formatting is lost in the output. I also downloaded fontbox-1.5.0.jar and jempbox-1.5.0.jar, but I am not sure how to use them to make the extracted text file match the layout of the original PDF as closely as possible.
Is there any good documentation on this topic that uses the recent jars? I found some material through Google, but it either uses a much earlier version (0.8) of PDFBox or the explanation is too thin to follow. It is not covered in the PDFBox FAQ. Do you have an archived mailing list I could look up? Many thanks, Jack

