[
https://issues.apache.org/jira/browse/PDFBOX-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879791#action_12879791
]
Chris Chadwick commented on PDFBOX-751:
---------------------------------------
Hi, I have asked our customer whether we can include the image or not. In the
meantime, can you comment on whether this issue has been seen before?
> Text Extraction truncates last character when image page has sideways text
> --------------------------------------------------------------------------
>
> Key: PDFBOX-751
> URL: https://issues.apache.org/jira/browse/PDFBOX-751
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.1.0
> Environment: HP UX 11iV1
> Reporter: Chris Chadwick
>
> When using unsorted text extraction on a PDF that contains a horizontal page
> (normal orientation text) followed by a page where all the text is rotated 90
> degrees (landscape), the last character of each word is forced onto a new
> line. For example:
> Thi
> s
> erro
> r
> wa
> s
> logge
> d
> toda
> y
> It is only the last letter of each phrase that is affected, and it is only
> affected on the rotated page.
> Selecting the text from the image directly - in Adobe do 'Select All', cut -
> produces the correct results, as do other tools, so the text layer appears
> correct in the PDF file.
> Also, please could you publish when V1.2 will be ready, as this may resolve
> this issue. Is it available as a beta?
>
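To make the symptom described above easier to reproduce and check, here is a minimal sketch of a post-processing helper that rejoins a single-character line onto the line before it. It is not part of PDFBox and does not fix the underlying extraction problem. The class name OrphanCharCheck and the method rejoinOrphans are hypothetical, and the sketch assumes the extracted text is already available as a String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a PDFBox class: works on already-extracted text.
public class OrphanCharCheck {

    // Appends each one-character, non-whitespace line to the previous
    // non-empty line. All other lines are kept unchanged.
    public static String rejoinOrphans(String extracted) {
        String[] lines = extracted.split("\r?\n");
        List<String> out = new ArrayList<String>();
        for (String line : lines) {
            boolean orphan = line.length() == 1
                    && !Character.isWhitespace(line.charAt(0));
            int last = out.size() - 1;
            if (orphan && last >= 0 && out.get(last).length() > 0) {
                out.set(last, out.get(last) + line);
            } else {
                out.add(line);
            }
        }
        // Rebuild the text with '\n' separators.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < out.size(); i++) {
            if (i > 0) {
                sb.append('\n');
            }
            sb.append(out.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample taken from the output quoted in the issue description.
        String sample = "Thi\ns\nerro\nr\nwa\ns\nlogge\nd\ntoda\ny";
        System.out.println(rejoinOrphans(sample));
    }
}
```

Note that this heuristic would also merge genuine one-letter lines (for example a standalone "a" or "I") into the preceding line, so it is only suitable as a diagnostic or stopgap on the affected rotated pages.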
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.