[jira] [Comment Edited] (PDFBOX-2272) Can't extract vertical text correctly

Tilman Hausherr (JIRA) Tue, 14 Jul 2015 11:44:29 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626841#comment-14626841
 ]


Tilman Hausherr edited comment on PDFBOX-2272 at 7/14/15 6:43 PM:
------------------------------------------------------------------

Here's the change as a patch, just to show that this isn't some bureaucratic 
trick. Hopefully somebody will understand it... I've never worked deeply on 
that part of PDFBox, except for two bug fixes (one of them from you).


> Can't extract vertical text correctly
> -------------------------------------
>
>                 Key: PDFBOX-2272
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2272
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.6, 2.0.0
>            Reporter: Biligsaikhan Batjargal
>         Attachments: PDFTextStripper.java, test.pdf, test.txt, vertical.diff
>
>
> - -1.8.6 can't extract the Unicode due to failing to map the UCS2 CMap for 
> 90ms-RKSJ-V.-
> - 2.0 extracts the text but can't handle the vertical layout
> Also see the file from PDFBOX-2294 which contains both horizontal and 
> vertical text.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

