[ 
https://issues.apache.org/jira/browse/PDFBOX-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson closed PDFBOX-1552.
-------------------------------

    Resolution: Not a Problem

The text in this PDF is embedded in lowercase. PDFBox produces the same result
as Acrobat's Save As > Plain Text.

> Uppercase letters are read in lowercase manner
> ----------------------------------------------
>
>                 Key: PDFBOX-1552
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1552
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.7.1
>         Environment: Windows XP
>            Reporter: Hesham
>         Attachments: pdf_with_uppercase_letters.pdf
>
>
> I have a PDF in which some uppercase letters are read as lowercase when I 
> extract its contents using PDFBox. For example:
> - Word "Testing" is read as "testing"
> - Word "Eve" is read as "eve"
> - Word "Deuteronomy" is read as "deuteronomy"
> Andreas commented on this: "The pdf uses marked content to replace a 
> string (14.9.4 Replacement Text of the PDF specs provides a simple example). 
> And yes, PDFBox doesn't support it, yet."
> Please check this 1-page sample PDF.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
