[
https://issues.apache.org/jira/browse/PDFBOX-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr closed PDFBOX-3134.
-----------------------------------
Resolution: Cannot Reproduce
We're not progressing. I still don't know what PDFBox version you're using, and
I'm beginning to suspect you don't know either. The problem you're describing
does not occur with the attached PDF file. So I'm closing this as "cannot
reproduce". You can still comment or reopen.
If you need help finding out what's going on in your project, please tell us
which IDE (Eclipse, NetBeans, IntelliJ) you are using and how your project is
built (e.g. Maven, Ant, or something else). Do this on the user mailing list,
as it has more readers. Hopefully somebody there uses the same IDE and can tell
you where to look to find out your version.
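Independent of the IDE, one way to check which jar is actually on the classpath
is to ask the JVM at runtime. The following is only a minimal sketch using
standard JDK calls; the class name LibraryVersionProbe and the describe helper
are made up for illustration, and it assumes the jar's manifest declares an
Implementation-Version entry (if not, only the jar location is informative).

```java
import java.security.CodeSource;

public class LibraryVersionProbe {

    /** Describes where a class was loaded from and what version its jar manifest declares. */
    static String describe(Class<?> type) {
        // The manifest's Implementation-Version entry, or null if none is declared.
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();

        // The jar or directory the class was loaded from; null for JDK bootstrap classes.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown location"
                : source.getLocation().toString();

        return type.getName()
                + " | version: " + (version == null ? "not declared" : version)
                + " | loaded from: " + location;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumption: the class to probe is passed as the first argument, e.g.
        // org.apache.pdfbox.pdmodel.PDDocument when PDFBox is on the classpath.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(describe(Class.forName(className)));
    }
}
```

Run from within the project with org.apache.pdfbox.pdmodel.PDDocument as the
argument, the "loaded from" part should point at the PDFBox jar in use, whose
file name usually carries the version number.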
> 'Certain' PDF Extraction issue on double letters (i.e. 'ss' ) - drops second
> letter
> -----------------------------------------------------------------------------------
>
> Key: PDFBOX-3134
> URL: https://issues.apache.org/jira/browse/PDFBOX-3134
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Reporter: Raymond Cabrera
> Attachments: Anton Legore Engineering Resume.pdf, PDFBOX-3134.txt
>
>
> Hi there,
> We have users uploading PDF files, and for some of these files the extracted
> text loses one letter of a doubled pair: a word like 'Mississauga' comes out
> as 'Misisauga'. This seems to occur only on some PDFs. The issue is also not
> present when using the original Word file.