Extracted Text of MS Word generated PDFs corrupt
------------------------------------------------
Key: PDFBOX-855
URL: https://issues.apache.org/jira/browse/PDFBOX-855
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 1.3.0
Environment: All
Reporter: Hendrik Lescak
Since revision 1003195 (PDFBOX-828: fixed some issues with positioning when
extracting or rendering text), text extraction with PDFTextStripper behaves
differently for PDF documents generated with the "Save as PDF" feature of
MS Office Word 2007: spurious spaces are inserted inside words.
For example, the term "Fachbereichsleiter" is now extracted as
"F a c hb e re ic hsle ite r".
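
A regression check for this symptom could look roughly like the sketch below.
It does not call PDFBox at all; it only assumes that the extracted text is
already available as a String (for example from PDFTextStripper). The class
SpacingCheck and its methods are hypothetical names used for illustration
and are not part of the PDFBox API.

```java
// Hypothetical helper, not part of PDFBox: flags extracted text in which a
// known term only shows up once whitespace is ignored.
final class SpacingCheck {

    private SpacingCheck() {
    }

    /** Removes all whitespace so that spacing differences are ignored. */
    static String stripWhitespace(String text) {
        return text.replaceAll("\\s+", "");
    }

    /**
     * Returns true if the expected term is missing from the extracted text
     * as written, but present once all whitespace has been removed.
     */
    static boolean hasSpuriousSpaces(String extracted, String expectedTerm) {
        if (extracted.contains(expectedTerm)) {
            return false; // term was extracted intact
        }
        return stripWhitespace(extracted).contains(stripWhitespace(expectedTerm));
    }

    public static void main(String[] args) {
        String expected = "Fachbereichsleiter";
        // Extraction result quoted in this report.
        String afterChange = "F a c hb e re ic hsle ite r";

        System.out.println(hasSpuriousSpaces(afterChange, expected)); // prints "true"
        System.out.println(hasSpuriousSpaces(expected, expected));    // prints "false"
    }
}
```

Because the comparison ignores all whitespace, it can also match a term that
merely spans two adjacent words, so it is probably best used with long,
distinctive terms such as the one above.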