Andreas Lehmkühler created PDFBOX-4805:
------------------------------------------

             Summary: Regression in 2.0.19
                 Key: PDFBOX-4805
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4805
             Project: PDFBox
          Issue Type: Improvement
          Components: Text extraction
    Affects Versions: 2.0.19
            Reporter: Andreas Lehmkühler
            Assignee: Andreas Lehmkühler
             Fix For: 2.0.20


Joel Hirsh reported a regression in PDFTextStripper that was introduced in 
2.0.19; see his post on 
[users@|https://lists.apache.org/thread.html/r35b50f5b00a39dcf6e77637e2ff2e097f26c395628ae476ab37b344a%40%3Cusers.pdfbox.apache.org%3E]
 for details.

He can't share the PDF in question for privacy reasons, but he did some debugging 
and found that PDFBOX-4760 is the cause of the regression. I accidentally 
committed some [unrelated code|https://svn.apache.org/r1873653] which leads to 
bad text extraction results. As the code targets some corner cases, it didn't 
come up as an issue when running our pre-release tests. The issue is limited to 
the 2.0 trunk.




