Text Extraction truncates last character when image page has sideways text
--------------------------------------------------------------------------

                 Key: PDFBOX-751
                 URL: https://issues.apache.org/jira/browse/PDFBOX-751
             Project: PDFBox
          Issue Type: Bug
          Components: Text extraction
    Affects Versions: 1.1.0
         Environment: HP UX 11iV1
            Reporter: Chris Chadwick


When using unsorted text extraction on a PDF that contains a horizontal page 
(normal orientation text) followed by a page where all the text is rotated 90 
degrees (landscape), the last character of each word is forced onto a new 
line. For example:

Thi
s
erro
r
wa
s
logge
d
toda
y

Only the last letter of each word is affected, and only on the rotated 
page.
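
To check an extracted page for this symptom, or to work around it until a 
fix is available, the split characters can be rejoined after extraction. The 
following is a minimal Java sketch and not part of PDFBox: the class name 
SplitLastCharCheck and the method rejoin are hypothetical, and it assumes the 
extracted text is already held in a String. It is only a heuristic, since a 
genuine one-letter word on a line of its own would also be merged.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a PDFBox class.
class SplitLastCharCheck {

    // Appends each one-character line to the line before it.
    // Assumption: a one-character line directly after a longer line is the
    // split-off last letter described in this report.
    static List<String> rejoin(String extracted) {
        List<String> out = new ArrayList<>();
        boolean prevWasJoined = false;
        for (String line : extracted.split("\\r?\\n")) {
            String t = line.trim();
            if (t.isEmpty()) {
                continue; // skip blank lines
            }
            if (t.length() == 1 && !out.isEmpty() && !prevWasJoined) {
                int last = out.size() - 1;
                out.set(last, out.get(last) + t);
                prevWasJoined = true;
            } else {
                out.add(t);
                prevWasJoined = false;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample = "Thi\ns\nerro\nr\nwa\ns\nlogge\nd\ntoda\ny\n";
        // prints "This error was logged today"
        System.out.println(String.join(" ", rejoin(sample)));
    }
}
```

If the number of lines returned is roughly half the number of non-blank 
lines in the input, the page very likely shows the problem described here.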

Selecting the text from the image directly (in Adobe, do 'Select All', then 
cut) produces the correct results, as do other tools, so the text layer in 
the PDF file appears to be correct.

Also, could you please say when V1.2 will be ready, as it may resolve this 
issue? Is it available as a beta?
 
