[
https://issues.apache.org/jira/browse/PDFBOX-2384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149911#comment-14149911
]
John Hewson commented on PDFBOX-2384:
-------------------------------------
Another problem: the TestTextStripperPerformance test doesn't do anything if
the output directories already exist (oops). It also doesn't actually do any
performance testing; it simply runs PDFTextStripper on the same files as
TestTextStripper does.
> ExtractText should default to UTF-8
> -----------------------------------
>
> Key: PDFBOX-2384
> URL: https://issues.apache.org/jira/browse/PDFBOX-2384
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 2.0.0
> Reporter: John Hewson
>
> ExtractText (and perhaps also PDFTextStripper) should default to UTF-8, which
> is what most people expect. Two long-standing open issues, PDFBOX-755 and
> PDFBOX-970, stem from not having a good default.
> I've escalated this to a bug; see the first comment.
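
For illustration, here is a minimal sketch of what "default to UTF-8" could look like when writing extracted text. This is not the PDFBox code: the class EncodingDefaultSketch and the helpers resolveCharset and encode are hypothetical names, and the sketch assumes the caller passes null when no encoding was requested. Only standard JDK classes are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDefaultSketch {

    // Hypothetical helper: use the requested charset if one was given,
    // otherwise fall back to UTF-8 instead of the platform default.
    static Charset resolveCharset(String requested) {
        return requested == null ? StandardCharsets.UTF_8 : Charset.forName(requested);
    }

    // Hypothetical helper: writes the text through an OutputStreamWriter
    // with an explicit charset and returns the encoded bytes.
    static byte[] encode(String text, String requested) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, resolveCharset(requested))) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // "café": the accented letter takes two bytes in UTF-8, one in ISO-8859-1.
        System.out.println(encode("caf\u00e9", null).length);         // prints 5
        System.out.println(encode("caf\u00e9", "ISO-8859-1").length); // prints 4
    }
}
```

The point of passing the charset explicitly is that the output no longer depends on the platform default encoding, which varies between machines.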