[ https://issues.apache.org/jira/browse/PDFBOX-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Hewson closed PDFBOX-1552.
-------------------------------
Resolution: Not a Problem
The text in this PDF is embedded in lowercase. PDFBox gives the same result as
Acrobat's "Save As > Plain Text".
> Uppercase letters are read in lowercase manner
> ----------------------------------------------
>
> Key: PDFBOX-1552
> URL: https://issues.apache.org/jira/browse/PDFBOX-1552
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.7.1
> Environment: Windows XP
> Reporter: Hesham
> Attachments: pdf_with_uppercase_letters.pdf
>
>
> I have a PDF in which, when I read its contents using PDFBox, some uppercase
> letters are read as lowercase. For example:
> - Word "Testing" is read as "testing"
> - Word "Eve" is read as "eve"
> - Word "Deuteronomy" is read as "deuteronomy"
> Andreas commented on this: "The pdf uses marked content to replace a
> string (14.9.4 Replacement Text of the PDF specs provides a simple example).
> And yes, PDFBox doesn't support it, yet."
> Please check this 1-page sample PDF.
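
To illustrate the mechanism mentioned in the quoted comment (replacement text
via marked content, PDF spec section 14.9.4), here is a minimal sketch in plain
Java. It is not PDFBox code and uses no PDFBox API. The class name, the method
names and the sample content stream are all hypothetical, and it is only an
assumption that a file like the attached one carries an /ActualText entry of
this shape (the resolution above indicates the embedded text itself is
lowercase). The sketch handles only simple literal strings shown with Tj and an
inline property dictionary, with no escaped parentheses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy extractor, for illustration only (not part of PDFBox).
public class ActualTextSketch {

    // A marked-content sequence with an inline /ActualText entry, e.g.
    //   /Span <</ActualText (T)>> BDC (t) Tj EMC
    private static final Pattern SPAN = Pattern.compile(
            "/Span\\s*<<\\s*/ActualText\\s*\\(([^)]*)\\)\\s*>>\\s*BDC(.*?)EMC",
            Pattern.DOTALL);

    // A literal string shown with the Tj operator, e.g. (esting) Tj
    private static final Pattern SHOWN = Pattern.compile("\\(([^)]*)\\)\\s*Tj");

    // Concatenates every string shown with Tj in the given fragment.
    private static String shownText(String fragment) {
        StringBuilder sb = new StringBuilder();
        Matcher m = SHOWN.matcher(fragment);
        while (m.find()) {
            sb.append(m.group(1));
        }
        return sb.toString();
    }

    // Text exactly as drawn, ignoring any /ActualText replacement.
    public static String extractRaw(String content) {
        return shownText(content);
    }

    // Text with each marked span replaced by its /ActualText value.
    public static String extractWithActualText(String content) {
        StringBuilder sb = new StringBuilder();
        Matcher m = SPAN.matcher(content);
        int pos = 0;
        while (m.find()) {
            // Keep the drawn text before the span, then substitute the replacement.
            sb.append(shownText(content.substring(pos, m.start())));
            sb.append(m.group(1));
            pos = m.end();
        }
        sb.append(shownText(content.substring(pos)));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical content stream: a lowercase "t" is drawn, but the span
        // declares "T" as its replacement text.
        String content = "/Span <</ActualText (T)>> BDC (t) Tj EMC (esting) Tj";
        System.out.println(extractRaw(content));
        System.out.println(extractWithActualText(content));
    }
}
```

An extractor that ignores the replacement text returns the strings as drawn,
while one that honours it substitutes the /ActualText value for the whole
marked span.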
--
This message was sent by Atlassian JIRA
(v6.2#6252)