Extracted Text of MS Word generated PDFs corrupt
------------------------------------------------

                 Key: PDFBOX-855
                 URL: https://issues.apache.org/jira/browse/PDFBOX-855
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 1.3.0
         Environment: All
            Reporter: Hendrik Lescak


Since revision 1003195 (PDFBOX-828: fixed some issues with positioning when 
extracting or rendering text), text extraction with PDFTextStripper produces 
different output for PDF documents generated with the MS Office Word 2007 
"Save as PDF" feature: spaces are inserted inside words.

For example, the term "Fachbereichsleiter" is now extracted as 
"F a c hb e re ic hsle ite r" since the PDFBOX-828 change.
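
The corruption in this example consists only of inserted spaces; the letters 
themselves are intact and in order. A minimal sketch of a check that could be 
applied to extracted text when comparing revisions is shown below. It uses 
only the Java standard library and is not part of PDFBox; the class name 
SpacingCheck and its methods are hypothetical, and the extracted string is 
assumed to contain single spaces where the report wraps across lines.

```java
public final class SpacingCheck {
    private SpacingCheck() {}

    /** Removes every whitespace character from the text. */
    public static String stripWhitespace(String text) {
        return text.replaceAll("\\s+", "");
    }

    /** Counts the whitespace characters in the text. */
    public static int countWhitespace(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /** True when the strings are unequal but match once whitespace is removed. */
    public static boolean differsOnlyInWhitespace(String expected, String extracted) {
        return !expected.equals(extracted)
                && stripWhitespace(expected).equals(stripWhitespace(extracted));
    }

    /** Number of whitespace characters in extracted beyond those in expected. */
    public static int extraWhitespace(String expected, String extracted) {
        return countWhitespace(extracted) - countWhitespace(expected);
    }

    public static void main(String[] args) {
        // Strings taken from the example in this report.
        String expected = "Fachbereichsleiter";
        String extracted = "F a c hb e re ic hsle ite r";

        System.out.println(differsOnlyInWhitespace(expected, extracted));
        System.out.println(extraWhitespace(expected, extracted));
    }
}
```

In a real comparison, the extracted string would come from running 
PDFTextStripper on the affected document before and after revision 1003195.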

