[jira] Closed: (PDFBOX-235) cannot extract japanese text

JIRA Sat, 02 Oct 2010 08:51:59 -0700

     [ 
https://issues.apache.org/jira/browse/PDFBOX-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Andreas Lehmkühler closed PDFBOX-235.
-------------------------------------

    Resolution: Cannot Reproduce

As a sample PDF is missing, it is impossible to reproduce the issue. Set to closed.

> cannot extract japanese text
> ----------------------------
>
>                 Key: PDFBOX-235
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-235
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>            Priority: Minor
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1636410
> Originally submitted by nobody on 2007-01-15 18:56.
> If you parse the attached Japanese PDF file, you get the text, but with (I 
> guess) the wrong encoding. Other PDF text strippers extract the text correctly.
> [comment on SourceForge]
> Originally sent by nobody.
> Logged In: NO 
> Hi, 
> I have the same problem.
> The problem occurs because PDFBox doesn't read the DescendantFonts (CIDFont) of 
> a Type0 (composite) font and selects the standardEncoder instead.
> [comment on SourceForge]
> Originally sent by nobody.
> Logged In: NO 
> Can anyone tell me whether PDFBox can extract text from Japanese PDFs?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
